Skip to content

Exercises on Data Structures

Alex edited this page Mar 8, 2018 · 3 revisions

Rough notes about data structures to look at, to see if it is worth while to make exercises about them.

There are basically four "families" of abstract data types, and a number of interesting data structures which are members of the families.

  1. Arrays
  2. Linked Structures, when arrays just aren't good enough
  3. Trees
  4. Graphs

There are, of course, abstract data types which do not fall under these four families. The exceptions are:

  • List
  • Stack
  • Queue
  • Set

Sequence of exercises with motivation

I suppose one approach is to notice how useful arrays are.

First, we move away from the concrete syntax of array[index] to the more abstract array_get(array, index) and array_set(array, index, value). (Exercise 1)

But it would be nice if the array could resize automatically as needed, which is precisely the dynamic array data structure. (Exercise 2)

We observe how nice it is to have an array, and treat the index as somehow meaningful (e.g., in a census program, the index could be the age in years, and the value would be the number of people at this age). It would be nice if we could let the index be structured, in the sense of making it a string or a custom data structure. This gives us association lists (Exercise 3)

But then we can do some formal analysis of the association list (exercise 4) and notice how terrible it is. This gives motivation for...trying anything else.

This gives a natural motivation for hash tables (Exercise 5), tries (exercise 6), and so forth (exercises 7 through infinity).

Remark: Sets for Free. Actually, what's really interesting is...once we allow "anything" to be used as the index ("key"), the array is called an "associative array" (to each "key" is an associated "value"), a "dictionary", or more generally "map". The interesting thing is we can use this to implement a set, by making the key equal to the value we guarantee there are no duplicate entries.

Structure of Exercises

So, whenever discussing a new data structure, the basic questions one should ask are:

  • What are the basic operations on it? (What are the signatures of those operations, i.e., what are the inputs and outputs?)
  • What are the axioms for those operations?

The next questions one should ask, when implementing the thing:

  • What does the struct look like? (What fields does it have, and what type are those fields?)
  • What functions implement the operations defined on the type?
  • Can we implement the axioms as preconditions/postconditions?

The first batch of questions (what operations are defined, and what are their axioms) are hard to jump into...even asking what the struct looks like, and what functions are defined on it is hard.

So it seems to me the best approach is to gradually get the student to think about such things. Initially, a generic exercise should look like: given the data structure

typedef struct { /* ... */ } foo

Please implement the following functions

  • foo* new_foo() creates a new foo object in memory, and returns a pointer to it (or null if there is no memory left)
  • void free_foo(foo *obj) will delete the foo object in memory, and have obj be a null pointer; if obj is null already, do nothing
  • char* foo_toString(foo *f) will produce a string representation of f
  • bool foo_equals(foo *lhs, foo *rhs) tests if the two foo objects are equal, intuitively if lhs == rhs; if either point is null, returns false always
  • hash_t foo_hashCode(foo *obj) returns a hash code for obj; null pointers return 0 by convention
  • etc.

Basically, give the function signature, and specify the contract expected. Perhaps add some more axioms, like

  • if foo_equals(x, y) then foo_hashCode(x) == foo_hashCode(y) (equivalent foo objects have equivalent hash codes)
  • if foo_equals(x, y) then strcmp(foo_toString(x), foo_toString(y)) == 0 (equal foo objects have identical string representations)
  • etc.

After a few of these, perhaps the struct details can be left to the reader to decide, but more explanation of the data structure be given. (Having diagrams like the ones used in the interpreter handbook may be really useful, too.)

Clone this wiki locally