Data Structures and Algorithms
Workshop 1 - 1999
Familiarisation
The basic aims of this workshop are to
- familiarise you with techniques to time an algorithm
written in ANSI C and
- experimentally confirm the time complexity of
addition and searching operations for various
data structures.
You will need to write a simple program to insert data into
a tree,
measuring the average time to add an item
and to find a randomly chosen item in the tree.
- Download and save into your directory from the download window:
- Collection.h
- Collection.c
- tree_struct.c
- tree_add.c
- tree_find.c and
- the test program tc.c.
- Modify the collection implementation so that it uses a tree
structure rather than the original array.
Edit out the original structure, find and add methods and load
in the new ones. Of course, you will be able to leave the
specification of the Collection class untouched.
- Compile into an executable:
gcc -o tc tc.c collection.c
Note that you will need to use an
ANSI C compiler as the
programs are written in ANSI C.
On some Unix machines, cc
only accepts the older "K&R C".
- Run the program and verify that it runs as expected.
Examine the test program listing to determine how it runs!
- Now we want to find out how efficiently it runs:
- Modify the test program so that it
generates and then inserts a large number, n, of
integers into the collection.
Note that you will need to be careful about the
choice of the set of numbers that you generate.
Compare what happens to your times when you use
the set of integers, 1 to n, to
when you use the
rand() function to generate
a set of random numbers to add.
Once you have created a collection with n
items in it,
determine how long it takes, on average, to find an item
in the collection.
Again you will need to generate a set of "probe" data which
you search for in the collection.
(Searching for the same item multiple times may cause problems
and give you misleading answers - why?)
Timing
Timing an individual function call has some traps:
read these notes for some guidelines.
- Determine the average time to search the collection for
a randomly generated integer -
again by finding the time to search for n
randomly generated integers.
Use the random number generator,
rand(),
to generate random integers.
- Modify the program so that it prints out the
insertion and searching times for a range of values of
n.
Suitable values will produce run times between about
1 and 20 seconds. About 10 values should enable you to determine the
characteristics of the time vs n curve.
Warning: Note carefully that the test program,
tc.c is a test program -
designed to demonstrate that the collection code is correct.
You will need to re-organise it quite a bit to perform the
timing analyses efficiently.
Report
Prepare a brief report which summarises your results.
(The report should be plain ASCII text -
not the native form of
any word processor.)
This report should start by forming a hypothesis about your
expected results.
This should be followed by the actual results.
The conclusion should provide a convincing argument as to why these
results confirm your original hypotheses.
It should also highlight and attempt to explain any discrepancies.
You are expected to measure the addition and
searching times.
Your report should also discuss the difference (if any)
in results observed
when a sequence of integers, 1, 2, .. is used for the test data
compared to a randomly chosen list of integers.
If you design your program efficiently, you will be able to get
it to generate a table of data for you which you can paste
directly into the report.
Submission Instructions will be available soon!