Quirksand

SICP

SICP 2.5.3 (Extended) Solutions

September 16, 2016 11:13

Ex. 2.93.

The main part of the exercise is simply replacing all operations within the rational package to use the generic procedures. Each + gets replaced with add, * with mul, etc. This change only needs to take place in the internal procedures like add-rat, demonstrating the usefulness of that abstraction. The constructors and the actual installed procedures don’t have to be rewritten.

The one place where the generic operations can’t be used is in the conversion to other types. We don’t need to support generic operations in the integers and reals, but preserving the ability to raise or drop to those values is critical. The changes to rational must not disrupt how it used to work (with the exception of reducing to lowest terms, which is explicitly removed). We need to be able to actually convert the values, but only if it’s possible to do so.

To accomplish this, I re-used a method that helped solve a similar problem with complex numbers. The method to-scheme-number will get the value wrapped up in the package ‘number’ so that it can be used by Scheme’s routines and passed to a constructor. By defining it for integers and reals, we can then project and drop the rational numbers. Here’s the code for project:

 (put 'project 'rational ; to integer 
       ; This can only be done with values that can be coerced,
       ; as quotient only works on Scheme numbers.
       (lambda (x) (let ((num-n (to-scheme-number (numer x)))
                         (den-n (to-scheme-number (denom x)))
                         )
                     (if (and num-n den-n)
                         (make-integer (quotient num-n den-n))
                         false
                         )
                     )
         )
       )

One change made was to have to-scheme-number return false when the value cannot be converted. This new version of project will check if the numerator and denominator were successfully converted before doing the Scheme operation of quotient on them. Polynomials cannot be converted, Constant-term-only polynomials could be handled by drop so rationals that include them are excluded from raising and dropping.

With these routines in place, polynomials (or any other value) can be used as the components of rationals, and all previous operations on rationals will still work, aside from reduction to lowest terms. However, it does mean that not all rational numbers can be used with each other, since we could be mixing polynomial rationals with other numbers. I’ll have more to say on this at the end of this section.

Ex. 2.94.

This exercise doesn’t require any difficult operations; it’s simply a matter of connecting up and calling the procedures correctly. Since there is already an internal procedure remainder, all remainder-terms needs to do is use it on the result from div-terms. The higher level gcd-poly is almost identical to the arithmetic operations: it verifies that the two polynomials are in the same variable, and then calls a term-list operation on them (gcd-terms here). Finally, the system operation greatest-common-denominator is just another instance of using apply-generic. Non-polynomial values can just rely on Scheme’s gcd procedure.

What’s notable is that in order to add these simple routines to the polynomials, a lot of ‘internal’ procedures need to exist in the package. It turns out that div-terms depends on both mul-terms and add-terms, and those in turn depend on low-level accessor routines like order and coeff. Even though none of these change, they need to be copied into the package, since they are not stored anywhere in between package updates. This is not a particular weakness of the design in itself — what would normally happen outside of this context would be that the install package would be kept in a single location or file, and updated for each new change without having to be copied.

I also took the time to redo subtraction at this point. Previously I had no term-list subtraction procedure, and it was handled only on the polynomial level using addition and negation. Since gcd-terms will repeatedly call div-terms, which will then require subtraction, I added the sub-terms procedure. It does the same process of addition and negation, but on a term-list level. This avoids the unnecessary step of creating intermediate polynomials just to subtract term lists.

Ex. 2.95.

When we do this operation, we get the following result:

(polynomial
 sparse
 x
 (2 (rational (integer . 18954) integer . 2197))
 (1 (rational (integer . -37908) integer . 2197))
 (0 (rational (integer . 1458) integer . 169)))

In a more conventional format, that is:

18954 x 2 - 37908 x + 1458
2197 2197 169

At first glance, that looks to be a jumble of almost random numbers. However, recall that we eliminated the reduction of rational values to simplest terms. If we reduce those fractions, we get something a little better:

1458 x 2 - 2916 x + 1458
169 169 169

The denominators all match, and if we compare 1458 and 2916, we find the second is twice the first. Considering the expected answer, we can restate the result as:

1458 ( x 2 – 2 x + 1 )
169

It’s almost the answer we expected to see. It’s just scaled by the curious fraction 1458/169. So how did that get into the result? The answer is in the polynomial division algorithm. If we work it by hand, we notice something that occurs from the very first step.

polynomial division

The algorithm will divide the coefficients by each other, and if they don’t divide evenly, the result will be a rational number. When multiplied by the remaining terms and added, the denominator will be present in all of them. As the division proceeds, the leading coefficient of the divisor will be multiplied every time. Even if we were reducing fractions, that factor will be introduced at each step, and will not be reducible unless it somehow divides every term of the dividend.

For polynomial division, the goal is not to produce integer results, but merely to eliminate the leading term by subtraction. It’s interesting to note that while this algorithm resembles the traditional long division algorithm, it works subtly differently, since we are not always able — or even trying — to select values that give us integers in the quotient. We can only eliminate the leading term of the dividend using the leading term of the divisor, and we do so using the reciprocal of their coefficients.

By way of comparison, look at the difference in results if we worked in this fashion on a decimal value, treating it as a polynomial (effectively with x = 10):

decimal division

The result is a string of strange fractions. It looks to be just as much a mess as the GCD result, as it has the same sort of built-up fraction. But it does in fact sum to the proper quotient of 211 (despite the ‘remainder’). Similarly, as long as our system can store fractions properly, the calculated GCD result from the arithmetic system would indeed divide the two polynomials. Nothing actually is wrong with the division of polynomials. The problem goes a bit deeper than that.

The real issue is, just what does the ‘Greatest’ part of ‘Greatest Common Divisor’ mean when applied to polynomials? In case you have’t thought about it, there is actually no way to order all the polynomials, so we can’t simply select the ‘largest’ polynomial from the set of those that divide two others. Two polynomials cannot be reliably compared in the same way two integers can. We therefore need to have a proper definition of what GCD really means here. The text glosses this by stating “the notion” of the GCD makes sense for polynomials (which it does), but only somewhat circularly implies that the GCD might have been the result produced by Euclid’s algorithm.

A proper definition of the GCD for the polynomials This definition applies to a broader class uses two rules, along with a strict definition of what division means.

By saying ‘A divides B’ what we mean is that there exists Q such that B = A * Q.

D is a GCD of P1 and P2 if these two conditions hold:
1. D divides both P1 and P2.
2. If C divides P1 and P2, then C also divides D.

Therefore D will be ‘greatest’ only in the sense that it includes all factors that divide both P1 and P2. For polynomials, there will be more than one GCD. That would seem to frustrate our attempts to compute and compare the GCD, but as it turns out, all GCDs only differ from each other by a constant. That this is true isn’t too hard to show, since all GCDs must divide each other (see here for a proof of this along with several other statements that apply to polynomial division). But the very fact that multiple GCDs exist is the reason why we didn’t get the answer we expected.

Euclid’s algorithm gives us a GCD, but only one of an infinite possible number of them. What we need to do, then, is to decide on a process that can find a GCD without the build-up of messy rational values, especially since at some point we might not be able to store the fractions with the desired precision. The remainder of the exercises will find a way to do this, and give us a ‘sensible’ GCD.

Ex. 2.96.

2.96-a.

This procedure is essentially the same as remainder-terms, but we need to compute the integerizing factor first. The only difficulty it presents is that we have to raise the first coefficient to the power determined by the order of the two polynomials. Since polynomial coefficients are in-system arithmetic values, we have two choices: Either implement exponentiation as a generic operation, or convert the coefficients and operate on them using Scheme procedures. While the first is arguably better as the operation could be useful in the system, I’ve gone the second route here, using to-scheme-number to force it into a value that will work with expt. That is a built-in math procedure that raises its first argument to the power of the second argument.

(let ((intc (expt (to-scheme-number (coeff (first-term b)))
                      (- (+ 1
                            (order (first-term a))
                            )
                         (order (first-term b))
                         )
                      )
                )
          )

Note that getting the exponent is easy, as we can use order on the terms. The constructed integerizing factor is then put into a temporary constant term, which can be multiplied by the dividend term list using mul-term-by-all-terms, and that’s it for calculating the pseudoremainder.

That cleans up the rational values, but testing confirms that we still don’t get the GCD we want.

2.96-b.

We have to use our generic greatest-common-divisor operation on the term list, since the coefficients are in-system values. It only allows two values at a time, so to figure it out from a list of more, I did it recursively, taking the GCD of the cdr of the list first and then finding the GCD of that and the first element (i.e. the car). One alternative might be to take elements of the list two at a time, and repeating the process with a list that gets split in half each time.

I made a slight change to the exercise here, and didn’t even notice that I had done it until writing this up. Instead of modifying gcd-terms, I reduced the coefficients one step higher up, in gcd-poly. This is really a matter of style. To my mind, I don’t see the reduction step as a necessary element of computing gcd-terms, which is to say I’d rather keep it as-is and do the reduction outside of it. An argument could be made that all the GCD calculation should be performed at the lowest level, to keep the polynomial-level procedure simpler. It ultimately doesn’t appear to have an impact on performance of this procedure — modifying gcd-terms directly would simply mean changing the terminal branch from returning just a to returning it with the GCD of the coefficients divided out.

Ex. 2.97.

2.97-a.

I implemented the algorithm with a let at each major step. That allows for a sequence of calculations with each result being labeled. First, the GCD is computed. Next, the integerizing coefficient is calculated, using essentially the same method as in 2.96-a. This allows us to compute the (integerized) numerator and denominator with the GCD divided out. The coefficient GCD is then calculated, using gcd-coeffs from 2.96-b, and the generic GCD operation on the two results of that. This is finally converted to a constant term, and the returned result will have that constant divided out of both the numerator and denominator.

Incidentally, my earlier design decision (not changing gcd-terms to remove common factors) now turns out to have an effect on this procedure’s performance. We do compute the GCD of the terms, but go on to remove the common factors from the fraction at the very end. That means that any earlier scaling down the GCD coefficients is to some extent ‘wasted’ effort. Since I don’t do it in gcd-terms, that will be avoided. On the other hand, the unreduced GCD may then produce a larger integerizing factor in the second step, which will lead to larger numbers in the multiplications. On balance, this is likely a minor performance increase, since mathematical calculations are generally much quicker to perform than running through the extra procedure.

The procedure reduce-poly is almost identical to div-poly, although I did not go so far as to create custom accessors analogous to quotient and remainder.

2.97-b.

Adding a generic operation should come fairly easily now. The top-level procedure calls apply-generic with the operation. Each type that implements the operation needs to create an updated package. For the integers, there isn’t much else needed in the package. I did also create a reduce operation for complex, which is more of a ‘rationalization’ – it turns complex fractions into a single complex value with rational coefficients, by using the complex conjugate.

In calculating the reduced values, I also added in a reduced tag, which isn’t really necessary. It’s use was mostly for debugging, but there could be cases where knowing whether the value you have is reduced or not might help. To keep it from interfering with the existing generic operations, it’s defined to go away on any drop.


I’d like to mention one more thing about the way we chose only one GCD out of many. It’s sort of implied in Exercise 2.95 that P1 could be anything, but consider for example if P1 was 2 x 2 – 6 x – 4. If you run this through the same process, you find that after the reduction step the GCD of Q1 and Q2 is not equal to P1. It will differ by a constant, since this P1 has a common factor in each term, and that gets removed when the GCD is computed. What is the choice we made for the GCD? It’s the one with the smallest (in absolute value) integer coefficients. This has the advantage of ensuring that if we start with integer polynomials, they’ll stay that way, but other reasonable choices can be made (for instance, making a monic polynomial where the leading coefficient is always 1). The ‘correct answer’ here only matches because the first polynomial met the same conditions as our choice for GCD.

Lastly, with these new additions to the system, we’ve added to its complexity in a big way. Previously, we could use rational values in any operation. Now, however, we can’t really be certain of whether that rational value is a numerical or a polynomial fraction. In other words, we would have to develop other features of the system in order to actually make this ‘generic’ system of rationals work. Of course, that’s well beyond the scope of this exercise set, but it’s worth keeping in mind that while the initial addition of functionality was relatively easy to implement thanks to our abstractions, it does not make it any less complex in practice.

Solutions 2.5.3