SICP 3.1.2 Exercises

May 28, 2017 11:28

While this section is mainly about the benefits of assignment, once we start to test our procedures, one drawback is almost immediately apparent. The benefit we’re getting from assignment is that we can easily have functions where the output can vary. The problem for testing is that we’d rather have some expected value to check it against, which we can’t do if the output is unpredictable.

It might seem that one solution would be to see if the result is in some range, or allow for an approximate answer. This is not necessarily reliable, however, when dealing with random values. In many cases there is a small chance of the value being outside the expected range, even when working as intended. This might result in a test that only passes, say, 99.9% of the time. When it fails, however, we cannot know if that is due to some introduced bug, or simply because we hit the 0.1% chance that time.

Such systems are thus inherently impossible to perfectly test for correctness. This doesn’t mean there is no place for randomness in testing, or that we should abandon the attempt. In fact, the second exercise mentions using a reset for random-number generators Technically these are pseudorandom numbers in order to get a repeatable list of values, which is quite useful in exercising a system that has the ability to take a broad range of inputs. The arithmetic system from last chapter would be one example. Using a resettable list allows us to repeat the test exactly, to see if a possible bug persists when the code is changed. Even though the testing approach is incomplete, it can sometimes find limitations of the system that might otherwise go undetected.

Since the book doesn’t fully explain how to implement rand (or rand-update specifically), I’ll recommend that if you are using Dr. Racket’s “Pretty Big” (as I do) or MIT Scheme, you can use the random procedure that is built-in. Racket’s random does not work as required for random-in-range, however, so I’ve included an alternate implementation in the Exercise file. Consult the documentation for random to learn how to set a particular state. For Exercise 3.6, you may need to learn more about how pseudorandom numbers are typically generated, though the footnote from the text discussing rand-update gives an idea.

In the book, it is suggested that part of Exercise 3.5 is determining how to apply the integral test in order to check the value of pi. For this reason, the tests are incomplete in the exercise file. What I have done is write a procedure to make it easier to display the results of testing, which is listed in full below. What it does is take an estimator function, the actual value being tested, and the number of trials to input to the estimator. It will then report the calculated estimate (and optionally how close it comes to the actual value).

(define (monte-carlo-test trials estimator actual-value)
  (let ((result (estimator trials))
  (display trials)
  (display " trials: ")
  (display result)
  (if showerror
        (display " % Error: ")
        (display (* 100 (abs (/ (- actual-value result) actual-value))))  

The procedure estimator should take one argument, the number of trials, and return an estimate. An example is given in the file that uses estimate-integral and a triangle-shaped region. This should work as a guide to write your own tests using the unit circle in order to get an approximation of π. Additionally, there are a few more tests added to explore what happens to the estimate when certain parameters are varied, such as defining a different circle for area testing or altering the bounding box for the Monte Carlo integration.

Exercises 3.1.2.


SICP 3.1.1 Solutions

April 2, 2017 17:49

Ex. 3.1.

Our procedure needs to produce something that is itself a procedure, since the accumulator adds the value when it is executed. To keep track of the value, the variable sum is adjusted using set!. Because sum is the argument originally passed to make-accumulator, it only exists locally in the accumulator that is produced from that call. It will therefore be independent of any other accumulators generated by this procedure, as each call creates a different local sum variable.

Ex. 3.2.

Here is the procedure in full:

(define (make-monitored proc)
  (let ((count 0))
    (define (how-many-calls?)
    (define (reset-count)
      (set! count 0)
    (define (dispatch arg)
        ((eq? arg 'how-many-calls?)  (how-many-calls?))
        ((eq? arg 'reset-count) (reset-count))
         (set! count (+ count 1))
         (proc arg)

This is modeled on the bank account procedures from the text. First, a local storage variable for the count is created (using let). Then there are sub-procedures for each operation. Finally, there is the dispatch procedure, which is what is actually returned.

Using sub-procedures allows dispatch to be simpler in form, with just one line for each special action performed. When none of the special messages match, the default case occurs. That means the count is incremented and the original procedure is called. About the only difference between this dispatch procedure and the one in make-account from the text is that here the else-case performs a delegation to the monitored procedure, and is the ‘normal path’. In make-account the else is a catch-all for unrecognized input.

While this monitoring function definitely provides us a powerful method for tracking how procedures are used, it is a bit simplistic in this form. The most obvious drawback is that the monitored procedures must take one (and only one) argument. One way to deal with this limitation is demonstrated in the last exercise, when we put make-monitored to use for testing the solution.

Ex. 3.3.

My intent in adding the password feature to the account was to make the least amount of changes to the original make-account. To this end, only the new required password argument is added, and it is handled in an if statement, separate from the cond of the original procedure. This is probably an influence on me of working in other programming languages, as it would be functionally equivalent to include this case as the first line of the cond statement. Either way, the password is the first thing checked, and nothing else should be processed if it is not the one provided when the account was created.

A potentially tricky aspect of this procedure is deciding how the ‘complaint’ should be reported. The text suggests that a message is returned. However, in order for the procedure to function properly, we cannot just return a message straight from dispatch. The way the account is used is that each interaction should resolve to a procedure that operates on a numeric value. Even though in this case that value should be ignored, it’s still necessary to return a procedure. My returned procedure is a lambda expression that takes one argument but just spits out the complaint instead of doing anything with the argument.

Ex. 3.4.

Since the variable that counts the number of attempts must be internal to each account, we should set it up using a let statement, as was done in make-monitored. To keep this variable separated from the dispatched procedures, the let that defines it wraps only around the dispatch procedure itself. The password lockout exists in the ‘gateway’ portion of the verification process, and should only be defined and accessed at that point.

There are two mechanisms that must act on the attempt variable. Firstly, each failed attempt must increment it, so that when the limit is reached the call-the-cops procedure is invoked. Secondly, there needs to be a way to reset it, so that only consecutive failures are flagged. This demonstrates the value of local state variables — having the account proceed through multiple, nearly identical states until reaching failure would in general be unwieldy as well as unintuitive.

One question that is not handled in the text is what exactly should happen to the account once call-the-cops has been called. Most modern systems with a similar limitation on login attempts go into a lockout state and often require additional action (or a timeout period) to unlock the account. The text doesn’t really require it, but I implemented a lockout by using a flag to show that call-the-cops had already been invoked. Another approach might be just a check to see if the number of attempts has exceeded the limit, but I find using a second variable to convey the actual state of the account to be clearer.

To test it, we can use the make-monitored function, since it can monitor whether the alarm procedure was called. In order to use it, however, we need to jump through some hoops. As previously mentioned, make-monitored only works with arguments that take one procedure. To get around that, we do this:

(define old-ctc call-the-cops)

(define monitored-ctc (make-monitored 
                       (lambda (x) 

(define (call-the-cops)
  (monitored-ctc null)

To make the monitored version, we use a lambda to create a function of a single argument. But since we don’t ever need the argument, it ignores it and does nothing but run (the original) call-the-cops instead. When we use it, we now need to pass an argument to be monitored, which is simply going to be ignored. Note that we check the counts for testing by calling monitored-ctc with the special monitoring messages, not our new call-the-cops. This slight logical hole is fine as long as the new version is the only function that ever calls it.

The reason we had to redefine the procedure to call our monitored version is that the account will only call the procedure named call-the-cops. On the one hand, this makes it easy to test our code without having to change the account procedure. On the other hand, it demonstrates one problem that arises when procedures can be arbitrarily renamed — simply replacing call-the-cops with something else could render the account security procedure impotent. (This isn’t so much a worry about malicious hacking as it is about the ability of programmers to unintentionally wreck their own or others’ code).

Solutions 3.1.1


SICP 3.1.1 Exercises

March 12, 2017 17:49

In Chapter 3, a completely new concept is introduced – storing values to create a local state. This changes our programs so that they have memory and can change over time. As we’ll see, that has benefits and drawbacks.

For these exercises, the tests are back to a simpler form. They won’t be wrapped in lambda statements, which means that any errors will cause the program to exit. This is partly to keep things simple for the new chapter, and partly to better understand how assignment affects the testing process. Since later tests will depend on the state of things after earlier tests, it’s better to exit as soon as one fails than to continue into potential chaos.

With the ability to have a local state in our programs, performing operations during the test itself may change that state. Previously, we could run our tests with any value and in any order, and the results of one test had no affect on any of the other tests. We now have to be aware of what the test operation does, and may need to pay attention to the order of the operations within a test.

This is not wholly undesirable behavior. In fact, much of the time we will want to run a test to determine that desired changes in state actually do occur. Thus following a sequence of operations will be a critical part of our testing toolkit. What we’d rather not have to worry about is whether things have changed based on the other tests we run. If we always need to monitor the state in order to write our tests, it becomes harder to modify them — adding or removing a single test might make it necessary to rewrite every test that follows after it. Not only that, but it can mean we aren’t testing our code as well as we thought, since the test might only be applicable to a very particular sequence that does not always occur in practice.

In this section, the tests are short and the sequences are not too hard to follow. Therefore some tests will go in sequence; I’ve tried to note when they rely on the current state. In future sections, we’ll do something to deal with the stored state during testing so that we aren’t expending effort to keep track of our system’s state when it’s not important to us.

Another exercise in this section demonstrates how assignment and state can be of help in our testing. By having procedures that store values in a testing context, we can monitor the process under test, and even gain information about it without having to change the underlying code. In fact, the test state storage function will immediately be put to use for testing two exercises after it is written. In that exercise, a monitored version will be used to determine whether the procedure that was expected to be run (call-the-cops) actually was.

Note that the testing code depends on being able to redefine existing variables; if your version of Scheme doesn’t allow this, the code won’t work as written. The monitored version also requires that call-the-cops does not take any arguments, which isn’t one of its requirements in the exercise. However, there isn’t likely to be any need for an argument in any version you come up with. Alternatively, if you do have call-the-cops take an argument, you can modify the monitoring to allow it.

Exercises 3.1.1


SICP 2.5.3 (Extended) Solutions

September 16, 2016 11:13

Ex. 2.93.

The main part of the exercise is simply replacing all operations within the rational package to use the generic procedures. Each + gets replaced with add, * with mul, etc. This change only needs to take place in the internal procedures like add-rat, demonstrating the usefulness of that abstraction. The constructors and the actual installed procedures don’t have to be rewritten.

The one place where the generic operations can’t be used is in the conversion to other types. We don’t need to support generic operations in the integers and reals, but preserving the ability to raise or drop to those values is critical. The changes to rational must not disrupt how it used to work (with the exception of reducing to lowest terms, which is explicitly removed). We need to be able to actually convert the values, but only if it’s possible to do so.

To accomplish this, I re-used a method that helped solve a similar problem with complex numbers. The method to-scheme-number will get the value wrapped up in the package ‘number’ so that it can be used by Scheme’s routines and passed to a constructor. By defining it for integers and reals, we can then project and drop the rational numbers. Here’s the code for project:

 (put 'project 'rational ; to integer 
       ; This can only be done with values that can be coerced,
       ; as quotient only works on Scheme numbers.
       (lambda (x) (let ((num-n (to-scheme-number (numer x)))
                         (den-n (to-scheme-number (denom x)))
                     (if (and num-n den-n)
                         (make-integer (quotient num-n den-n))

One change made was to have to-scheme-number return false when the value cannot be converted. This new version of project will check if the numerator and denominator were successfully converted before doing the Scheme operation of quotient on them. Polynomials cannot be converted, Constant-term-only polynomials could be handled by drop so rationals that include them are excluded from raising and dropping.

With these routines in place, polynomials (or any other value) can be used as the components of rationals, and all previous operations on rationals will still work, aside from reduction to lowest terms. However, it does mean that not all rational numbers can be used with each other, since we could be mixing polynomial rationals with other numbers. I’ll have more to say on this at the end of this section.

Ex. 2.94.

This exercise doesn’t require any difficult operations; it’s simply a matter of connecting up and calling the procedures correctly. Since there is already an internal procedure remainder, all remainder-terms needs to do is use it on the result from div-terms. The higher level gcd-poly is almost identical to the arithmetic operations: it verifies that the two polynomials are in the same variable, and then calls a term-list operation on them (gcd-terms here). Finally, the system operation greatest-common-denominator is just another instance of using apply-generic. Non-polynomial values can just rely on Scheme’s gcd procedure.

What’s notable is that in order to add these simple routines to the polynomials, a lot of ‘internal’ procedures need to exist in the package. It turns out that div-terms depends on both mul-terms and add-terms, and those in turn depend on low-level accessor routines like order and coeff. Even though none of these change, they need to be copied into the package, since they are not stored anywhere in between package updates. This is not a particular weakness of the design in itself — what would normally happen outside of this context would be that the install package would be kept in a single location or file, and updated for each new change without having to be copied.

I also took the time to redo subtraction at this point. Previously I had no term-list subtraction procedure, and it was handled only on the polynomial level using addition and negation. Since gcd-terms will repeatedly call div-terms, which will then require subtraction, I added the sub-terms procedure. It does the same process of addition and negation, but on a term-list level. This avoids the unnecessary step of creating intermediate polynomials just to subtract term lists.

Ex. 2.95.

When we do this operation, we get the following result:

 (2 (rational (integer . 18954) integer . 2197))
 (1 (rational (integer . -37908) integer . 2197))
 (0 (rational (integer . 1458) integer . 169)))

In a more conventional format, that is:

18954 x 2 - 37908 x + 1458
2197 2197 169

At first glance, that looks to be a jumble of almost random numbers. However, recall that we eliminated the reduction of rational values to simplest terms. If we reduce those fractions, we get something a little better:

1458 x 2 - 2916 x + 1458
169 169 169

The denominators all match, and if we compare 1458 and 2916, we find the second is twice the first. Considering the expected answer, we can restate the result as:

1458 ( x 2 – 2 x + 1 )

It’s almost the answer we expected to see. It’s just scaled by the curious fraction 1458/169. So how did that get into the result? The answer is in the polynomial division algorithm. If we work it by hand, we notice something that occurs from the very first step.

polynomial division

The algorithm will divide the coefficients by each other, and if they don’t divide evenly, the result will be a rational number. When multiplied by the remaining terms and added, the denominator will be present in all of them. As the division proceeds, the leading coefficient of the divisor will be multiplied every time. Even if we were reducing fractions, that factor will be introduced at each step, and will not be reducible unless it somehow divides every term of the dividend.

For polynomial division, the goal is not to produce integer results, but merely to eliminate the leading term by subtraction. It’s interesting to note that while this algorithm resembles the traditional long division algorithm, it works subtly differently, since we are not always able — or even trying — to select values that give us integers in the quotient. We can only eliminate the leading term of the dividend using the leading term of the divisor, and we do so using the reciprocal of their coefficients.

By way of comparison, look at the difference in results if we worked in this fashion on a decimal value, treating it as a polynomial (effectively with x = 10):

decimal division

The result is a string of strange fractions. It looks to be just as much a mess as the GCD result, as it has the same sort of built-up fraction. But it does in fact sum to the proper quotient of 211 (despite the ‘remainder’). Similarly, as long as our system can store fractions properly, the calculated GCD result from the arithmetic system would indeed divide the two polynomials. Nothing actually is wrong with the division of polynomials. The problem goes a bit deeper than that.

The real issue is, just what does the ‘Greatest’ part of ‘Greatest Common Divisor’ mean when applied to polynomials? In case you have’t thought about it, there is actually no way to order all the polynomials, so we can’t simply select the ‘largest’ polynomial from the set of those that divide two others. Two polynomials cannot be reliably compared in the same way two integers can. We therefore need to have a proper definition of what GCD really means here. The text glosses this by stating “the notion” of the GCD makes sense for polynomials (which it does), but only somewhat circularly implies that the GCD might have been the result produced by Euclid’s algorithm.

A proper definition of the GCD for the polynomials This definition applies to a broader class uses two rules, along with a strict definition of what division means.

By saying ‘A divides B’ what we mean is that there exists Q such that B = A * Q.

D is a GCD of P1 and P2 if these two conditions hold:
1. D divides both P1 and P2.
2. If C divides P1 and P2, then C also divides D.

Therefore D will be ‘greatest’ only in the sense that it includes all factors that divide both P1 and P2. For polynomials, there will be more than one GCD. That would seem to frustrate our attempts to compute and compare the GCD, but as it turns out, all GCDs only differ from each other by a constant. That this is true isn’t too hard to show, since all GCDs must divide each other (see here for a proof of this along with several other statements that apply to polynomial division). But the very fact that multiple GCDs exist is the reason why we didn’t get the answer we expected.

Euclid’s algorithm gives us a GCD, but only one of an infinite possible number of them. What we need to do, then, is to decide on a process that can find a GCD without the build-up of messy rational values, especially since at some point we might not be able to store the fractions with the desired precision. The remainder of the exercises will find a way to do this, and give us a ‘sensible’ GCD.

Ex. 2.96.


This procedure is essentially the same as remainder-terms, but we need to compute the integerizing factor first. The only difficulty it presents is that we have to raise the first coefficient to the power determined by the order of the two polynomials. Since polynomial coefficients are in-system arithmetic values, we have two choices: Either implement exponentiation as a generic operation, or convert the coefficients and operate on them using Scheme procedures. While the first is arguably better as the operation could be useful in the system, I’ve gone the second route here, using to-scheme-number to force it into a value that will work with expt. That is a built-in math procedure that raises its first argument to the power of the second argument.

(let ((intc (expt (to-scheme-number (coeff (first-term b)))
                      (- (+ 1
                            (order (first-term a))
                         (order (first-term b))

Note that getting the exponent is easy, as we can use order on the terms. The constructed integerizing factor is then put into a temporary constant term, which can be multiplied by the dividend term list using mul-term-by-all-terms, and that’s it for calculating the pseudoremainder.

That cleans up the rational values, but testing confirms that we still don’t get the GCD we want.


We have to use our generic greatest-common-divisor operation on the term list, since the coefficients are in-system values. It only allows two values at a time, so to figure it out from a list of more, I did it recursively, taking the GCD of the cdr of the list first and then finding the GCD of that and the first element (i.e. the car). One alternative might be to take elements of the list two at a time, and repeating the process with a list that gets split in half each time.

I made a slight change to the exercise here, and didn’t even notice that I had done it until writing this up. Instead of modifying gcd-terms, I reduced the coefficients one step higher up, in gcd-poly. This is really a matter of style. To my mind, I don’t see the reduction step as a necessary element of computing gcd-terms, which is to say I’d rather keep it as-is and do the reduction outside of it. An argument could be made that all the GCD calculation should be performed at the lowest level, to keep the polynomial-level procedure simpler. It ultimately doesn’t appear to have an impact on performance of this procedure — modifying gcd-terms directly would simply mean changing the terminal branch from returning just a to returning it with the GCD of the coefficients divided out.

Ex. 2.97.


I implemented the algorithm with a let at each major step. That allows for a sequence of calculations with each result being labeled. First, the GCD is computed. Next, the integerizing coefficient is calculated, using essentially the same method as in 2.96-a. This allows us to compute the (integerized) numerator and denominator with the GCD divided out. The coefficient GCD is then calculated, using gcd-coeffs from 2.96-b, and the generic GCD operation on the two results of that. This is finally converted to a constant term, and the returned result will have that constant divided out of both the numerator and denominator.

Incidentally, my earlier design decision (not changing gcd-terms to remove common factors) now turns out to have an effect on this procedure’s performance. We do compute the GCD of the terms, but go on to remove the common factors from the fraction at the very end. That means that any earlier scaling down the GCD coefficients is to some extent ‘wasted’ effort. Since I don’t do it in gcd-terms, that will be avoided. On the other hand, the unreduced GCD may then produce a larger integerizing factor in the second step, which will lead to larger numbers in the multiplications. On balance, this is likely a minor performance increase, since mathematical calculations are generally much quicker to perform than running through the extra procedure.

The procedure reduce-poly is almost identical to div-poly, although I did not go so far as to create custom accessors analogous to quotient and remainder.


Adding a generic operation should come fairly easily now. The top-level procedure calls apply-generic with the operation. Each type that implements the operation needs to create an updated package. For the integers, there isn’t much else needed in the package. I did also create a reduce operation for complex, which is more of a ‘rationalization’ – it turns complex fractions into a single complex value with rational coefficients, by using the complex conjugate.

In calculating the reduced values, I also added in a reduced tag, which isn’t really necessary. It’s use was mostly for debugging, but there could be cases where knowing whether the value you have is reduced or not might help. To keep it from interfering with the existing generic operations, it’s defined to go away on any drop.

I’d like to mention one more thing about the way we chose only one GCD out of many. It’s sort of implied in Exercise 2.95 that P1 could be anything, but consider for example if P1 was 2 x 2 – 6 x – 4. If you run this through the same process, you find that after the reduction step the GCD of Q1 and Q2 is not equal to P1. It will differ by a constant, since this P1 has a common factor in each term, and that gets removed when the GCD is computed. What is the choice we made for the GCD? It’s the one with the smallest (in absolute value) integer coefficients. This has the advantage of ensuring that if we start with integer polynomials, they’ll stay that way, but other reasonable choices can be made (for instance, making a monic polynomial where the leading coefficient is always 1). The ‘correct answer’ here only matches because the first polynomial met the same conditions as our choice for GCD.

Lastly, with these new additions to the system, we’ve added to its complexity in a big way. Previously, we could use rational values in any operation. Now, however, we can’t really be certain of whether that rational value is a numerical or a polynomial fraction. In other words, we would have to develop other features of the system in order to actually make this ‘generic’ system of rationals work. Of course, that’s well beyond the scope of this exercise set, but it’s worth keeping in mind that while the initial addition of functionality was relatively easy to implement thanks to our abstractions, it does not make it any less complex in practice.

Solutions 2.5.3


SICP 2.5.2 Exercises

August 30, 2015 10:58

In this section the generic arithmetic package continues to expand. It will now allow operations that mix the types of numbers allowed. While the system works nearly identically to the one in 2.5.1, there are some new types that the exercises rely on. To avoid crowding the definitions into the exercise file, both the new types and the tests that deal with them have been moved into separate files. These two are called ‘gen-arith252.scm’ and ‘gen-arith-tests_v2.scm’, and are located in the library directory. The tests have been rewritten so that they can apply to the old system (which is used in the first two exercises) and the new ‘tower’ system of hierarchical types.

There is one wrinkle in this: the Integer type will fail some of the arithmetic property tests, because integer division does not always produce an integer result. It would probably be best to further separate the arithemtic property tests into addition and multiplication tests, but since we can run failing tests without exiting the program, I’m overlooking it for now. There is also a hack at the start of the file that lets us use the new tests on the original system. Scheme numbers are used to stand in for Integer and Real types.

Exercise 2.82 deals with operations that require an arbitrary number of arguments. While I did define a new operation to examine how coercion may work, I have not included any that takes more than two arguments. For this case, using the previous tests should suffice.

I’d like to explain a feature of the tower/hierarchical system that I included for Exercises 2.83 and later. In order to have a progression from rationals to reals (and back), an arbitrary ‘rationalization’ can be performed on a real. This allows it to be expressable as a fraction (rational) as long as there is an equivalent (sidenote) ‘equivalent’ as defined by equ? fractional representation that does not require a denominator larger than a given number. The cutoff for denominator size is arbitrary because all values of the Real type, which internally are floating-point values, are actually just high-precision fractions and could thus always be expressed as a Rational.

Implementing this feature requires using getdenom to project a Real in Exercise 2.85. The alternative is to simply disallow dropping of Reals to Rationals. This is acceptable, but be aware that some of the tests I’ve written may fail if that is the case.

As the exercises go on, there are continual changes to the system. Each time we can go back and run the same basic tests. Because the system interface remains the same, the tests continue to work, and this ensures we didn’t wreck anything that was previously working. Typically, a few new tests are added to verify the specific feature just added or modified.

In truth, the number of additional tests I’ve written really do very little to exercise the system, but as this section is fairly intensive, and I’ve already spent far too long working on just the solutions, I didn’t want to add even more tests that would need to be figured out. Hopefully by this point you’re able to identify what features to look for and can write a few tests of your own. It’s never wrong to go back and modify the tests themselves as the system develops and changes.

Exercises 2.5.2

Generic Arithmetic system definitions



SICP 2.5.1 Solutions

August 9, 2015 10:28

Ex. 2.77.

The procedure magnitude is defined to call apply-generic with the argument given, which happens no matter what the number is. The error occurs because it cannot find an entry for magnitude with the type complex. The newly-added definitions are rather simple. All they do is call magnitude (or real-part, etc.) again. However, this time the argument is different. Whenever apply-generic is used, it strips off the initial type tag. Since complex numbers in the system have two type tags, the second call to magnitude will retrieve the appropriate function for the secondary type (either rectangular or polar).

While it’s not a bad idea to go through the process of calling magnitude manually, we can use a programmatic trace to verify that we’re on the right track with this line of thought. With it enabled, the series of procedure calls is as follows (this is not code, but the trace output with my notations):

(magnitude      (complex rectangular 3 . 4) )  ; global version
(apply-generic  magnitude (complex rectangular 3 . 4))
(magnitude      (rectangular 3 . 4))           ; global version
(apply-generic  magnitude (rectangular 3 . 4))
(magnitude   (3 . 4))  ; inside the rectangular package
; Calls as part of the magnitude function
(real-part  (3 . 4))   ; rectangular version
(imag-part  (3 . 4))   ; rectangular version

We can see that the system goes through two calls to apply-generic for the magnitude, once with a number tagged as complex, and the second time with the number as rectangular. The call with a rectangular type then looks up the version of magnitude that was added with the rectangular package. The third call to magnitude in our trace is not the one defined at the top level, but the one inside install-rectangular-package. (The trace indicates this by marking the source file and line number for the call; I’ve stripped that information out here.) There are also two more function calls that occur in the course of calculating the magnitude, which are still the versions local to the rectangular package.

Ex. 2.78.

By modifiying the suggested procedures, we can make a type check for numbers by explicitly using number? instead of checking the type tag. Because we want to store the number as itself, we do not attach the tag in attach-tag but simply return the contents for an actual number. Within contents, we return the thing itself when it is a simple number (in fact, we could probably get away with just returning the contents if they are not a pair). The final change is to type-tag. To keep the system working as before, we still need to indicate the tag of a regular number as a scheme-number.

A final note: These modifications do not actually change any numbers that are made using make-scheme-number, but they do allow them to be used alongside bare numbers. All calculations involving them will now end up returning bare values, however, as show-basic-arithmetic demonstrates.

Ex. 2.79.

Since we need equ? to work for each type in the system, we have to consider whether we need a version for each type or if one can cover them all. It turns out to be more sensible to apply different standards of equality based on the type. Here are the three versions:

; Rational
(put 'equ? '(rational rational)
       (lambda (a b) (= (* (numer a) (denom b)) (* (numer b) (denom a))))

; Ordinary/Scheme number
(put 'equ? '(scheme-number scheme-number)
     (lambda (a b)  (close-enough? a b))

; Complex
(put 'equ? '(complex complex)
     (lambda (a b) (and (close-enough? (real-part a) (real-part b))
                        (close-enough? (imag-part a) (imag-part b))

Rationals are the easiest to explain. A simple version would compare numerator to numerator, and denominator to denominator, and define equality if they both match. However, what about a case like 6/8 compared to 3/4? These are equal values, but they would not be considered equal using that simple version. A better definition of equality allows for non-reduced rationals to be equal to each other, and the easy way to do this is cross-multiply the numerators and denominators: if the products are equal, the numbers are equal.

For ordinary numbers, it might seem easy enough to use the built-in equality operator and be done with it. However, because this is an arithmetic system where we’ll be performing calculations, we may want a slightly different definition of what equality is.

The reason for this has to do with how floating-point values are actually represented as data. A floating-point value is stored as a binary fraction. That fraction may not be able to exactly match the decimal value we want it to. After a calculation, the result might then be represented as a very tiny bit different than the exact expected value. Using a strict definition of equality would make it hard to use equ? in a meaningful way.

It’ll be more useful to define two numbers as equal if they are within some very small range of each other. Ideally we would choose the maximum difference between a number and the decimal values that it approximates, to maintain precision. However, once we start performing calculations, the tiny differences can build up quickly, and the error in the result is difficult to predict. Using an arbitrary larger value will be sufficient for our purposes.

A more complete understanding of floating-point math in binary systems can be found by reading this classic article, or for a shorter and more simplified version, this website .

The complex numbers highlight the need for this approach. Our complex numbers internally use mathematical functions involving floating-point values. If we do not use an approximation comparison, it would be difficult to ever have a rectangular and a polar number that are equal to each other, which is a problem when the system ought not to care which underlying representation is used.

To test equality for complex numbers, the function close-enough? is used. It compares the real parts and imaginary parts separately. By defining equ? at only the level of complex, we can compare any two complex numbers without caring about their secondary type. We could have used magnitude and angle; either way we only need one. This naturally relies on the additional selectors added in Exercise 2.79 to make the complex package work properly.

Ex. 2.80.

With equ? defined, it’s pretty easy to see that =zero? works along similar lines. Each type of number will have its own operator in the table.

For ordinary numbers, we merely need to see if it is close enough to 0.

Rationals and complex numbers are slightly different. For a rational number, only the numerator needs to be equal to zero, and so a simple check of that is sufficient. Complex numbers are considered to be zero if they have zero magnitude, and that is the check we make. In my solutions, I used the =zero? operator in both of those definitions. This only works if we know that values returned by the numer and magnitude function will work in the arithmetic system, but since we added ordinary Scheme numbers, we know that they will at this point.

Extra: Expanding on testing and the zero-angle complex number problem

Once equ? and =zero? have been added to the system, it becomes possible to run the full set of tests. For each type of number, we run the basic arithmetic property tests, and then there are a few tests to check on actual computations. This isn’t terribly exhaustive, but it can catch at least some simple problems with the system.

One thing we discover is that the complex numbers (specifically, the rectangular ones) do not pass some of the tests. This problem was already mentioned in 2.4.2. The basic difficulty is this: using the simple conversion method, the real-imaginary number 0 + 0 i has an undefined angle value. Any calcuation that relies on computing its angle will cause an error.

My fix was to add a check in angle that determines if the real and imaginary parts are both 0. If that is so, it just returns 0 for the angle.

At this point our complex numbers pass all the tests, but since we know this could be a problem, we ought to add a few tests for this specific situation. Why add more tests when we have some that cover the situation? The primary reason is that the tests themselves provide information on how the system works. Adding a specific test is a reminder that we require this functionality. It also allows for the tests to be changed without worrying that we could be losing a ‘hidden’ test.

Just adding one test can also make us think of others that we may want (sometimes we might not need to add the extra tests, but just the consideration can help in understanding how we actually want the system to work). For example, in this case I went on to add a few tests that describe how zero-valued complex numbers (and also =zero?) should work.

(define (zero-angle-tests)
  (let ((z-ma (make-complex-from-mag-ang 0 0))
        (z-alt (make-complex-from-mag-ang 0 2)) 
        (z-ri (make-complex-from-real-imag 0 0))
        (c1 (make-complex-from-real-imag 3 5))
        (nc1 (make-complex-from-real-imag -3 -5))
    (displayln "Checking angle of zero-length complex numbers")
    (test-equ (lambda () (angle z-ma)) 0 "polar complex number with zero values has angle of 0")
    (test-equ (lambda () (angle z-alt)) 0 "polar complex number with non-zero angle value has angle of 0")  ; optional condition
    (test-true (lambda () (=zero? z-alt)) "polar complex number with non-zero angle is =zero?") ; optional condition
    (test-equ (lambda () (angle z-ri)) 0 "rect. complex number with zero values has angle of 0")
    (test-equ (lambda () (angle (add c1 nc1))) 0 "result of computation with rect. numbers has angle of 0")

Let’s look at what each of these tests does.

The first test makes sure that polar numbers with both values 0 work as expected. One could argue over whether this test is strictly necessary, since polar numbers are working correctly. In general, values of 0 tend to be problematic, so I’d keep it in.

Adding that test also made me realize that polar numbers of length 0 can have any angle. There remains the open question of whether such numbers should also be equal to each other, and whether their angle should always be considered 0. The next two tests cover those choices and can be considered optional. My own solution doesn’t force the value to 0, for instance.

The next one checks that rectangular numbers have the correct angle value, which is the issue the fix is intended to cover. We also want to make sure that the fix works even when we don’t create the number explicitly. A zero-valued number can arise as the result of a calculation, and so the last test is there to catch that. If you look at the implementation, you may note that the calculations actually end up calling the constructors, which would seem to make that test unnecessary. However, our tests should not make assumptions about implementation, unless it is a critical feature of the system. In this case it’s debatable whether that is so.

Over the course of the book we’ll see situations like this one, in which not all of a system’s behavior is delineated in the text. Finding ways to test it often raises questions that I answer for myself in the tests, but I would encourage you to alter and extend the tests as you see fit, since it is likely you will notice additional aspects that I have not considered.

Solutions 2.5.1


SICP 2.5 and testing

June 28, 2015 10:25

This is an extra post in which I’ll discuss some more advanced testing procedures, that will be used in the next section. It requires one special type of function not handled in the book, but otherwise only uses concepts we’ve already covered. While this testing system is not as full-featured or as elegant as something like rackunit, it is a definite improvment over a simple list of check functions. Understanding how it works is entirely optional; the tests can be applied without much explanation merely by having the files present.

The first part of the test-functions file is a set of check functions. These work the same as those used in older exercises. The only difference is that the underlying procedures check for failure instead of a true state. These check functions take the opposite of that result, and thus work by ensuring that the false outcome does not hold. The reason for this will be clearer in a moment, but for now let’s compare a few of the failure and check functions.

(define (fails-equ? expected observed)
  (if (equ? expected observed)
      (format "FAILURE: ~a is not equ? to ~a." observed expected)

(define (fails-=? expected observed)
  (if (= observed expected)
      (format "Check = FAILED.  Expected ~a but was ~a." expected observed)

(define (fails-equal? expected observed)
  (if (equal? expected observed)
      (format "Equality FAILED. Results were : ~a." observed)
(define (check-equal expected observed) (not (fails-equal? expected observed)))
(define (check-= expected observed) (not (fails-=? expected observed)))
(define (check-equ expected observed) (not (fails-equ? expected observed)))

These three functions check similar operations, and consequently they have a similar format. There is an if statement that uses the particular function for the check, and works with an ‘expected’ and ‘observed’ argument. They all vary in the way that non-failure is reported; you can decide what the merits and failings of each style is. (Note that the check functions don’t use this report, they only return a true or false; the reports will be used elsewhere).

Using true-false check functions is the same thing we’ve done before. They determine whether we have the correct answer or the wrong one. However, very often the problem in our code causes an error to occur, stopping execution immediately. That means it’s only possible to look at one problem at a time, and each issue must be fixed in sequence before continuing. That can make it tougher to figure out exactly what is going wrong, especially in a more complex system. To get around that problem, we need a different sort of function. This new type of function I’ve named using test to distinguish them from the check functions.

(define (test-equal observed expected . nameargs)
  (let ((testname (if (null? nameargs) "test-equal" (car nameargs)))
  	  (exec-test fails-equal? observed (list expected) testname)

This is a test function for equality. The first line assigns testname from nameargs, or uses a default name of test-equal if nameargs is empty. This allows the test name to be optional. We then call a function exec-test to actually perform the test. The second argument, the expected value, is passed via a list, and we use the same failure-checking function for equal? that check-equal had.

To really understand the system, we need to know what that call to exec-test does. Moving on to that procedure, we see this:

(define (exec-test test-procedure expression-under-test other-args testname)
      ([catch-all exc-display])  ; list of handlers
      (let ((failure (apply test-procedure (cons (expression-under-test) other-args)))
        (if (true? failure)
              (display testname)
              (display ": ")
              (if (string? failure)
                  (display failure)
                  (display "FAILED")
            (display "pass")
(define (catch-all exc)
  (exn:fail? exc)

(define (exc-display exc)
  (display "ERROR: ")
  (display (exn-message exc))

The function with-handlers is a special form that takes a list of handler functions and an expression. What it does is execute a given expression within a sort of bubble. Inside this bubble, errors do not lead directly to the program exiting. Program control is instead passed to the handler functions when an error occurs in that expression. Once the handler is done, the with-handlers block can exit normally and the program can proceed.

This is generally known as exception handling (or ‘error handling’) and is a feature of many programming languages. When an error occurs, an object known as an exception is created, which may contain some information about what went wrong. This allows either the interpreter or some other mechanism to decide what to do with it. All the errors you’ve encountered if you use Racket are in fact exceptions, being handled with the ‘default’ handler. While Scheme has never had a formal implementation for exceptions, most variations on the language have done something like Racket’s with-handlers procedure. Without getting into the details, we can see how it works by using these procedures as examples.

The first argument to with-handlers is the list of handler pairs, which are the procedures used to identify and handle the exceptions. (The use of square brackets here is Racket-specific and not worth going into; it’s effectively still a pair). The first element of the pair is an identification procedure that will be given the exception, and return true if we are willing to handle it. The second element is the actual procedure that will be used to handle it. This approach allows different kinds of exception handlers to be processed by different handlers.

Our handler id function catch-all is set up to handle every kind of exception, aside from explicit breaks (so that if an infinite loop occurs, you can still terminate execution). Then we have the actual handler exc-display, which is what gets executed once an exception occurs that we are willing to handle. In our case we want to report the error for our test and continue. The built-in function exn-message lets us get the message associated with the exception, and that’s what we can output to indicate an error in the test.

With our handlers in place, we can get on with the actual execution of a test. This is done by assigning to failure the result when we apply the test procedure using the arguments given. There’s also something special done with the ‘expression under test’ as it is passed to apply: it is executed as a procedure. Looking back at our test functions, we see that this is what ‘observed’ was, and therefore we know it must be a procedure. The reason for doing this is so that the observed value is computed within the with-handlers block. If it were simply passed as an argument, the expression would be evaluated as an argument, outside of the bubble, and we would not gain the benefits of error handling.

This special treatment to ensure execution inside the exception-handling bubble is only used on the one expression. That does make the observed argument unique in the tests. While this was done here merely as a matter of convenience, there could be some value in treating the tests in this fashion. It would enforce the condition that all computations that might result in error are confined to the ‘observed’ section, not the ‘expected’ answer. However, it also makes testing slightly less flexible, as there are situations where it’s more natural and preferable to use computed values for the expected results.

Whatever test-predicate is, it is supposed to return false if nothing went wrong, and may return anything else if failure occurs. This allows for newly-written test functions to report failure in any format desired. Success is merely indicated with a ‘pass’ message. It’s a convention in testing that success should be quiet, and report nothing or nearly nothing, since it’s only the failures that require attention. Tests typically are run repeatedly (even continuously if possible) and generating a lot of ‘success’ messages can make it harder to find the parts that actually need fixing.

Exception handling also allows us to add another type of test: one to ensure that an expected error actually does occur. This can be quite useful, as there are occasionally exercises that require an error on certain improper input.

Testing Example

To see how this works in action, here are some examples from the tests used for the ‘Generic Arithmetic’ system:

(test-equ (lambda () (mul n1 one)) n1
	"multiplicative identity works"
(test-equ (lambda () (mul n1 n2)) (mul n2 n1)   ; * computed 'expected' value *     
	"multiplication commutes"

; From Scheme number tests
(test-equ (lambda () (add s2 s1)) 11.47) 
(test-false (lambda () (equ? (add s3 one) s3)))

We see that each test requires its first argument to be a procedure, and this is accomplished using a lambda expression with no arguments. (A similar approach was used when measuring execution time in Racket). The first two tests also provide the optional name, which is only displayed if a failure occurs. Note that if errors occur we cannot display the test name, since that isn’t provided as part of the exception data.

The second test shown here highlights the potential for problems when only one ‘expected’ value is allowed. If an error occurs in (mul n2 n1), the program execution will be halted. A possible way around that is to make it similar to the format in the last test, which uses test-false and only requires one argument.

What’s important to test is that the two expressions yield identical results. Neither is really being tested more than the other, so using ‘observed’ and ‘expected’ in this manner is arguably inaccurate. On the other hand, adding a test-true wrapper is like adding extra words to the expression, making it slightly harder to see what’s happening. I prefer the expression as given, since it’s more concise. Feel free to modify it if you disagree.

The first file given below is just the test function framework. The second one contains tests used for the next set of exercises. If you are not using Racket, it should be possible to modify the test-functions file and leave the remaining tests as they are (as long as your version of Scheme can handle exceptions).

Test Functions

Generic Arithmetic Tests_


SICP 2.5.1 Exercises

June 28, 2015 15:53

In this section we’re building up the components of the previous section into a system with a generic interface. This system will be gradually altered over the course of 2.5, but its core functionality will remain the same. As we’ll see, the interface will largely be unchanged as well. This means that our tests can be re-used, or will only require slight alterations, as we go through this section.

The tests are included in a separate file. In this section, they’re also written using a simple test system that I wrote. You can use it with the exercises just by including it in your library directory (or however you choose to organize your files). If you want a better understanding of it, refer to the previous post.

In Exercise 2.79, the advantage of using the test functions is displayed, by contrasting them with the simpler check functions. When we try to do something that the system can’t handle (in this case, comparing a ‘regular’ Scheme number with a rational number), an error occurs. When using the test functions, the error is still indicated but the program can continue. Those check functions will need to be commented out or deleted, as we cannot make this sort of comparison [yet].

If you take a look at the test file for the arithmetic system, you can see that we’re reaping the benefits of the generic interface by using the same set of tests for multiple types of numbers. There’s a test for basic arithmetic properties that is called in this fashion:

(arith-property-tests zero one n1 n1-ai n1-mi n2)

By using arguments of the appropriate type, we’re able to apply these tests to any new number type we add to the system. Even as we change the implementation or add features, this set of tests will remain useful to ensure that the number type still satisfies some simple conditions, and should not need to change. This also means that some of the tests will result in an error when the full system has not been implemented. Luckily, our test framework can handle this and continue past them.

There are also a few tests that may or may not be necessary, as they cover aspects not specifically required of the system. Depending on what you think of as appropriate behavior to define, those tests can be included, altered, or ignored.

Finally, there is a flaw in the complex number implementation as written. It should be easy enough to fix, although as an extra exercise try to write some tests to cover the specific problem exhibited.

See the previous post for the required testing files.

Exercises 2.5.1