A spreadsheet is a collection matrices together with
some tools to visualize or the manipulate these data.
Open source "Excel" replacements like "gnumeric" or
"planmaker"
have the advantage that they do the math correctly, but they do not solve
conceptional problems of spreadsheet programs like Excel.
Virtually any real programming language allows to deal with data more
reliably and do that in a more modular way.
Here is an example: a course assistant keeps track of homework
grades in a text file "hw.txt"
|
# student HW grades for course "Ocean 12a"
# name first HW1 HW2 HW3 HW4 HW5
#
Clooney George 10 18 29 17 7
Pitt Brad 5 6 34 4 3
Damon Matt 13 1 22 0 10
Roberts Julia 9 19 27 7 6
Affleck Casey 16 14 15 3 13
Caan Scott 12 2 41 1 19
Qin Shaobo 9 28 28 16 7
Mac Bernie 5 15 34 4 2
Coltrane Robbie 13 11 21 19 10
Willis Bruce 8 28 27 7 5
|
|
Running the line (studentaverage.txt)
cat hw.txt|grep -v "#"|awk '{k=0;s=0; for (i=3;i<=NF;i++) {k++;s=s+$i;} print $1,$2,s/k}'
|
spits out a list of average homework scores for each student.
The unix tool "grep" filters out comments and "awk" does the math.
Average scores for HW1, HW2 of the entire class are obtained with
cat hw.txt|grep -v "#"|awk '{n1+=$3;n2+=$4;s++} END {print n1/s,n2/s}'
|
Here is a script "homeworkaverage.txt"
which gives the averages of each of the colums. These rather silly
examples can be part of a larger script which will produce the final grade.
It is more convenient to use Perl
or Python or even a computer algebra
system like Mathematica do such things.
But most of the time, especially in educational setups,
spreadsheet are used for trivial things like sorting,
taking averages or adding up weighted scores.
Often one has to merge different spreadsheets. Here is an example program
merge.perl written in Perl which accesses entries
in both matrices individually by its indices.
While it does the task not very efficiently, it allows a
"mathematically trained mind" to manipulate these matrices. Having a powerful
programming language at hand is especially useful when more complex row or
column manipulations have to be done, for example when entries have to be
checked for certain conditions and if there are exceptions.
Dealing with exceptions is especially difficult with "out of the shelf"
spreadsheet programs. Exceptions are also the main source for errors, when
dealing with data.
The two
postings in a slashdot
discussion
which hit the heart of the matter. Problems with the spreadsheet concept:
- Mixing of code with data makes it difficult to test and
audit things. How am I going to know from looking at a spread sheet
which data have been computed and which have been entered?
Splitting of code with data allows to test the code with lots of different
data. Testing a code with different (also extreme) data can
show problems with the code.
- No Modularization: is the main obstacle to maintain complex data.
The Unix philosophy
of building things up with independent building blocks or filters is more efficient
in the long run.
Each filter is typically small and can be developed and tested independently.
- Difficult to audit: how do we debug a complex spreadsheet written
by someone else, how do I proofread it? If I want to believe a statement,
I must be able to verify how it was derived or falsify it. This is very
difficult to achieve in general and pretty graphs do not help.
- Spagetti logic : entering formulas for cells just asks for
spaghetti coding. Problems come also from rounding errors which
build up.
|
|