In this section, the generic arithmetic package continues to expand: it will now allow operations that mix different types of numbers. While the system works nearly identically to the one in 2.5.1, there are some new types that the exercises rely on. To avoid crowding the definitions into the exercise file, both the new types and the tests that deal with them have been moved into separate files, called ‘gen-arith252.scm’ and ‘gen-arith-tests_v2.scm’, located in the library directory. The tests have been rewritten so that they can apply to the old system (which is used in the first two exercises) and the new ‘tower’ system of hierarchical types.
There is one wrinkle in this: the Integer type will fail some of the arithmetic property tests, because integer division does not always produce an integer result. It would probably be best to further separate the arithmetic property tests into addition and multiplication tests, but since we can run failing tests without exiting the program, I’m overlooking it for now. There is also a hack at the start of the file that lets us use the new tests on the original system. Scheme numbers are used to stand in for Integer and Real types.
Exercise 2.82 deals with operations that require an arbitrary number of arguments. While I did define a new operation to examine how coercion may work, I have not included any that takes more than two arguments. For this case, using the previous tests should suffice.
I’d like to explain a feature of the tower/hierarchical system that I included for Exercises 2.83 and later. In order to have a progression from rationals to reals (and back), an arbitrary ‘rationalization’ can be performed on a real. This allows it to be expressible as a fraction (rational) as long as there is an equivalent fractional representation (‘equivalent’ as defined by equ?) that does not require a denominator larger than a given number. The cutoff for denominator size is arbitrary, because all values of the Real type, which internally are floating-point values, are actually just high-precision fractions and could thus always be expressed as a Rational.
Implementing this feature requires using getdenom to project a Real in Exercise 2.85. The alternative is to simply disallow dropping of Reals to Rationals. This is acceptable, but be aware that some of the tests I’ve written may fail if that is the case.
As the exercises go on, there are continual changes to the system. Each time we can go back and run the same basic tests. Because the system interface remains the same, the tests continue to work, and this ensures we didn’t wreck anything that was previously working. Typically, a few new tests are added to verify the specific feature just added or modified.
In truth, the additional tests I’ve written do very little to exercise the system, but as this section is fairly intensive, and I’ve already spent far too long working on just the solutions, I didn’t want to add even more tests that would need to be figured out. Hopefully by this point you’re able to identify what features to look for and can write a few tests of your own. It’s never wrong to go back and modify the tests themselves as the system develops and changes.
The procedure magnitude is defined to call apply-generic with the argument given, which happens no matter what the number is. The error occurs because it cannot find an entry for magnitude with the type complex. The newly-added definitions are rather simple. All they do is call magnitude (or real-part, etc.) again. However, this time the argument is different. Whenever apply-generic is used, it strips off the initial type tag. Since complex numbers in the system have two type tags, the second call to magnitude will retrieve the appropriate function for the secondary type (either rectangular or polar).
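A minimal sketch of the two layers involved (the table entries follow the standard answer to Exercise 2.77; the selector shown at the bottom is the book’s rectangular-package version):

```scheme
;; Top-level generic selector: dispatches on the outermost type tag.
(define (magnitude z) (apply-generic 'magnitude z))

;; Newly-added entries in install-complex-package.  These simply call
;; the generic selectors again; by then apply-generic has stripped the
;; 'complex tag, so the second dispatch finds the rectangular (or
;; polar) version.
(put 'real-part '(complex) real-part)
(put 'imag-part '(complex) imag-part)
(put 'magnitude '(complex) magnitude)
(put 'angle     '(complex) angle)

;; Inside install-rectangular-package, the selector that finally does
;; the work (real-part and imag-part here are the package-internal
;; versions):
(define (magnitude z)
  (sqrt (+ (square (real-part z))
           (square (imag-part z)))))
```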
While it’s not a bad idea to go through the process of calling magnitude manually, we can use a programmatic trace to verify that we’re on the right track with this line of thought. With it enabled, the series of procedure calls is as follows (this is not code, but the trace output with my notations):
We can see that the system goes through two calls to apply-generic for the magnitude, once with a number tagged as complex, and the second time with the number as rectangular. The call with a rectangular type then looks up the version of magnitude that was added with the rectangular package. The third call to magnitude in our trace is not the one defined at the top level, but the one inside install-rectangular-package. (The trace indicates this by marking the source file and line number for the call; I’ve stripped that information out here.) There are also two more function calls that occur in the course of calculating the magnitude, which are still the versions local to the rectangular package.
By modifying the suggested procedures, we can make a type check for numbers by explicitly using number? instead of checking the type tag. Because we want to store the number as itself, we do not attach the tag in attach-tag but simply return the contents for an actual number. Within contents, we return the thing itself when it is a simple number (in fact, we could probably get away with just returning the contents if they are not a pair). The final change is to type-tag. To keep the system working as before, we still need to indicate the tag of a regular number as a scheme-number.
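The three modified procedures, sketched along the lines of the book’s originals:

```scheme
(define (attach-tag type-tag contents)
  (if (number? contents)
      contents                          ; store bare numbers as themselves
      (cons type-tag contents)))

(define (contents datum)
  (cond ((number? datum) datum)         ; a bare number is its own contents
        ((pair? datum) (cdr datum))
        (else (error "Bad tagged datum -- CONTENTS" datum))))

(define (type-tag datum)
  (cond ((number? datum) 'scheme-number) ; report bare numbers as before
        ((pair? datum) (car datum))
        (else (error "Bad tagged datum -- TYPE-TAG" datum))))
```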
A final note: These modifications do not actually change any numbers that are made using make-scheme-number, but they do allow them to be used alongside bare numbers. All calculations involving them will now end up returning bare values, however, as show-basic-arithmetic demonstrates.
Since we need equ? to work for each type in the system, we have to consider whether we need a version for each type or if one can cover them all. It turns out to be more sensible to apply different standards of equality based on the type. Here are the three versions:
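As a sketch (in the actual solution these sit inside the install procedures, and the tolerance in close-enough? is an arbitrary choice):

```scheme
;; Ordinary numbers: approximate equality within a small tolerance.
(define (close-enough? x y)
  (< (abs (- x y)) 0.00001))
(put 'equ? '(scheme-number scheme-number) close-enough?)

;; Rationals: cross-multiply, so non-reduced fractions compare equal.
(put 'equ? '(rational rational)
     (lambda (x y)
       (= (* (numer x) (denom y))
          (* (numer y) (denom x)))))

;; Complex: compare real and imaginary parts, each approximately.
(put 'equ? '(complex complex)
     (lambda (z1 z2)
       (and (close-enough? (real-part z1) (real-part z2))
            (close-enough? (imag-part z1) (imag-part z2)))))
```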
Rationals are the easiest to explain. A simple version would compare numerator to numerator, and denominator to denominator, and define equality if they both match. However, what about a case like 6/8 compared to 3/4? These are equal values, but they would not be considered equal using that simple version. A better definition of equality allows for non-reduced rationals to be equal to each other, and the easy way to do this is cross-multiply the numerators and denominators: if the products are equal, the numbers are equal.
For ordinary numbers, it might seem easy enough to use the built-in equality operator and be done with it. However, because this is an arithmetic system where we’ll be performing calculations, we may want a slightly different definition of what equality is.
The reason for this has to do with how floating-point values are actually represented as data. A floating-point value is stored as a binary fraction. That fraction may not be able to exactly match the decimal value we want it to. After a calculation, the result might then be represented as a very tiny bit different than the exact expected value. Using a strict definition of equality would make it hard to use equ? in a meaningful way.
It’ll be more useful to define two numbers as equal if they are within some very small range of each other. Ideally we would choose the maximum difference between a number and the decimal values that it approximates, to maintain precision. However, once we start performing calculations, the tiny differences can build up quickly, and the error in the result is difficult to predict. Using an arbitrary larger value will be sufficient for our purposes.
A more complete understanding of floating-point math in binary systems can be found by reading this classic article, or, for a shorter and simpler version, this website.
The complex numbers highlight the need for this approach. Our complex numbers internally use mathematical functions involving floating-point values. If we do not use an approximation comparison, it would be difficult to ever have a rectangular and a polar number that are equal to each other, which is a problem when the system ought not to care which underlying representation is used.
To test equality for complex numbers, the function close-enough? is used. It compares the real parts and imaginary parts separately. By defining equ? at only the level of complex, we can compare any two complex numbers without caring about their secondary type. We could have used magnitude and angle instead; either way, only one pair of selectors is needed. This naturally relies on the additional selectors added in Exercise 2.79 to make the complex package work properly.
With equ? defined, it’s pretty easy to see that =zero? works along similar lines. Each type of number will have its own operator in the table.
For ordinary numbers, we merely need to see if it is close enough to 0.
Rationals and complex numbers are slightly different. For a rational number, only the numerator needs to be equal to zero, and so a simple check of that is sufficient. Complex numbers are considered to be zero if they have zero magnitude, and that is the check we make. In my solutions, I used the =zero? operator in both of those definitions. This only works if we know that values returned by the numer and magnitude functions will work in the arithmetic system, but since we added ordinary Scheme numbers, we know that they will at this point.
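A sketch of the three table entries (again relying on close-enough? for ordinary numbers):

```scheme
(put '=zero? '(scheme-number)
     (lambda (x) (close-enough? x 0)))

;; A rational is zero when its numerator is; numer returns an ordinary
;; number, so the generic =zero? applies to it.
(put '=zero? '(rational)
     (lambda (x) (=zero? (numer x))))

;; A complex number is zero when its magnitude is.
(put '=zero? '(complex)
     (lambda (z) (=zero? (magnitude z))))
```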
Extra: Expanding on testing and the zero-angle complex number problem
Once equ? and =zero? have been added to the system, it becomes possible to run the full set of tests. For each type of number, we run the basic arithmetic property tests, and then there are a few tests to check on actual computations. This isn’t terribly exhaustive, but it can catch at least some simple problems with the system.
One thing we discover is that the complex numbers (specifically, the rectangular ones) do not pass some of the tests. This problem was already mentioned in 2.4.2. The basic difficulty is this: using the simple conversion method, the real-imaginary number 0 + 0i has an undefined angle value. Any calculation that relies on computing its angle will cause an error.
My fix was to add a check in angle that determines if the real and imaginary parts are both 0. If that is so, it just returns 0 for the angle.
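Sketched inside install-rectangular-package (real-part and imag-part here are the package-internal versions):

```scheme
(define (angle z)
  (if (and (= (real-part z) 0)
           (= (imag-part z) 0))
      0                                  ; define the angle of 0+0i as 0
      (atan (imag-part z) (real-part z))))
```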
At this point our complex numbers pass all the tests, but since we know this could be a problem, we ought to add a few tests for this specific situation. Why add more tests when we have some that cover the situation? The primary reason is that the tests themselves provide information on how the system works. Adding a specific test is a reminder that we require this functionality. It also allows for the tests to be changed without worrying that we could be losing a ‘hidden’ test.
Just adding one test can also make us think of others that we may want (sometimes we might not need to add the extra tests, but just the consideration can help in understanding how we actually want the system to work). For example, in this case I went on to add a few tests that describe how zero-valued complex numbers (and also =zero?) should work.
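The added tests look something like the following (a sketch: the constructors are the book’s, while the test functions are the ones from my framework, and the exact values differ from the real test file):

```scheme
;; Polar numbers with magnitude 0 (and angle 0) should count as zero.
(test-true (lambda () (=zero? (make-complex-from-mag-ang 0 0))))

;; Rectangular 0+0i is what the angle fix is meant to cover.
(test-true (lambda () (=zero? (make-complex-from-real-imag 0 0))))

;; A zero value arising from a calculation, not from a constructor.
(test-true (lambda ()
             (=zero? (sub (make-complex-from-real-imag 1 1)
                          (make-complex-from-real-imag 1 1))))
           "zero complex from subtraction")
```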
Let’s look at what each of these tests does.
The first test makes sure that polar numbers with both values 0 work as expected. One could argue over whether this test is strictly necessary, since polar numbers are working correctly. In general, values of 0 tend to be problematic, so I’d keep it in. Adding it also made me realize that polar numbers of length 0 can have any angle. There is then the open question of whether such numbers should also be equal to each other, possibly by forcing their angle to 0.
The rectangular numbers are then checked, which is what the fix is intended to cover. But we also want to make sure that the fix works even when we don’t create the number explicitly. A zero-valued number can arise as the result of a calculation, and so the last test is there to catch that. [If you look at the implementation, the calculations actually end up calling the constructors. However, our tests should not make that assumption unless it is a critical feature of the system, and it’s debatable whether that is so.]
Over the course of the book we’ll see situations like this one, in which not all of a system’s behavior is delineated in the text. Finding ways to test it often raises questions that I answer for myself in the tests, but I would encourage you to alter and extend the tests as you see fit, since it is likely you will notice additional aspects that I have not considered.
This is an extra post in which I’ll discuss some more advanced testing procedures that will be used in the next section. It requires one special type of function not handled in the book, but otherwise only uses concepts we’ve already covered. While this testing system is not as full-featured or as elegant as something like rackunit, it is a definite improvement over a simple list of check functions. Understanding how it works is entirely optional; the tests can be applied without much explanation merely by having the files present.
The first part of the test-functions file is a set of check functions. These work the same as those used in older exercises. The only difference is that the underlying procedures check for failure instead of a true state. These check functions take the opposite of that result, and thus work by ensuring that the false outcome does not hold. The reason for this will be clearer in a moment, but for now let’s compare a few of the failure and check functions.
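A sketch of what these look like (the names here are illustrative, not necessarily those in the actual file):

```scheme
;; Each failure function returns false when the check passes, and a
;; descriptive value otherwise; the reporting styles differ on purpose.
(define (failed-equal expected observed)
  (if (equal? expected observed)
      false
      (list "expected:" expected "observed:" observed)))

(define (failed-true observed)
  (if (eq? observed true)
      false
      "the value was not true"))

(define (failed-false observed)
  (if (eq? observed false)
      false
      'not-false))

;; The corresponding check functions simply negate the failure result.
(define (check-equal expected observed)
  (not (failed-equal expected observed)))
```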
These three functions check similar operations, and consequently they have a similar format. There is an if statement that uses the particular function for the check, and works with an ‘expected’ and ‘observed’ argument. They all vary in the way that non-failure is reported; you can decide what the merits and failings of each style are. (Note that the check functions don’t use this report; they only return true or false. The reports will be used elsewhere.)
Using true-false check functions is the same thing we’ve done before. They determine whether we have the correct answer or the wrong one. However, very often a problem in our code causes an error to occur, stopping execution immediately. That means it’s only possible to look at one problem at a time, and each issue must be fixed in sequence before continuing. That can make it tougher to figure out exactly what is going wrong, especially in a more complex system. To get around that problem, we need a different sort of function. I’ve named these new functions using test to distinguish them from the check functions.
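Here is a sketch of one of them (the argument order of exec-test, and the name failed-equal, are my reconstruction of what the description below implies):

```scheme
;; The optional test name arrives via a rest argument; the real work
;; happens in exec-test.
(define (test-equal observed expected . nameargs)
  (let ((testname (if (null? nameargs)
                      'test-equal
                      (car nameargs))))
    (exec-test failed-equal (list expected) observed testname)))
```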
This is a test function for equality. The first line assigns testname from nameargs, or uses a default name of test-equal if nameargs is empty. This allows the test name to be optional. We then call a function exec-test to actually perform the test. The second argument, the expected value, is passed via a list, and we use the same failure-checking function for equal? that check-equal had.
To really understand the system, we need to know what that call to exec-test does. Moving on to that procedure, we see something like this:
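A sketch of exec-test and its two handler procedures (names and output format are approximate, but the with-handlers shape is Racket’s actual form):

```scheme
;; Identification procedure: handle everything except explicit breaks,
;; so an infinite loop can still be interrupted.
(define (catch-all exn)
  (not (exn:break? exn)))

;; Handler: report the exception's message and let execution continue.
(define (exc-display exn)
  (display "ERROR in test: ")
  (display (exn-message exn))
  (newline))

(define (exec-test test-predicate expected-args observed testname)
  (with-handlers ([catch-all exc-display])
    ;; Calling (observed) here evaluates the expression under test
    ;; inside the exception-handling 'bubble'.
    (let ((failure (apply test-predicate
                          (append expected-args (list (observed))))))
      (if failure
          (begin (display testname)
                 (display " FAILED: ")
                 (display failure)
                 (newline))
          (display "pass ")))))
```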
The function with-handlers is a special form that takes a list of handler functions and an expression. What it does is execute a given expression within a sort of bubble. Inside this bubble, errors do not lead directly to the program exiting. Program control is instead passed to the handler functions when an error occurs in that expression. Once the handler is done, the with-handlers block can exit normally and the program can proceed.
This is generally known as exception handling (or ‘error handling’) and is a feature of many programming languages. When an error occurs, an object known as an exception is created, which may contain some information about what went wrong. This allows either the interpreter or some other mechanism to decide what to do with it. All the errors you’ve encountered while using Racket are in fact exceptions, handled by the ‘default’ handler. While classic Scheme standards did not specify exception handling, most variations on the language provide something like Racket’s with-handlers procedure. Without getting into the details, we can see how it works by using these procedures as examples.
The first argument to with-handlers is the list of handler pairs, which are the procedures used to identify and handle the exceptions. (The use of square brackets here is Racket-specific and not worth going into; it’s effectively still a pair). The first element of the pair is an identification procedure that will be given the exception, and return true if we are willing to handle it. The second element is the actual procedure that will be used to handle it. This approach allows different kinds of exceptions to be processed by different handlers.
Our handler id function catch-all is set up to handle every kind of exception, aside from explicit breaks (so that if an infinite loop occurs, you can still terminate execution). Then we have the actual handler exc-display, which is what gets executed once an exception occurs that we are willing to handle. In our case we want to report the error for our test and continue. The built-in function exn-message lets us get the message associated with the exception, and that’s what we can output to indicate an error in the test.
With our handlers in place, we can get on with the actual execution of a test. This is done by assigning to failure the result when we apply the test procedure to the arguments given. There’s also something special done with the ‘expression under test’ as it is passed to apply: it is executed as a procedure. Looking back at our test functions, we see that this is what ‘observed’ was, and therefore we know it must be a procedure. The reason for doing this is so that the observed value is computed within the with-handlers block. If it were passed in as a plain value, the expression would be evaluated at the call site, outside of the bubble, and we would not gain the benefits of error handling.
This special treatment to ensure execution inside the exception-handling bubble is only used on the one expression. That does make the observed argument unique in the tests. While this was done here merely as a matter of convenience, there could be some value in treating the tests in this fashion. It would enforce the condition that all computations that might result in error are confined to the ‘observed’ section, not the ‘expected’ answer. However, it also makes testing slightly less flexible, as there are situations where it’s more natural and preferable to use computed values for the expected results.
Whatever test-predicate is, it is supposed to return false if nothing went wrong, and may return anything else if failure occurs. This allows for newly-written test functions to report failure in any format desired. Success is merely indicated with a ‘pass’ message. It’s a convention in testing that success should be quiet, and report nothing or nearly nothing, since it’s only the failures that require attention. Tests typically are run repeatedly (even continuously if possible) and generating a lot of ‘success’ messages can make it harder to find the parts that actually need fixing.
Exception handling also allows us to add another type of test: one to ensure that an expected error actually does occur. This can be quite useful, as there are occasionally exercises that require an error on certain improper input.
To see how this works in action, here are some examples from the tests used for the ‘Generic Arithmetic’ system:
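A sketch along those lines (n1 and n2 stand in for numbers built earlier in the test file; the specific values here are illustrative, not the real test file’s):

```scheme
(define n1 (make-rational 1 2))
(define n2 (make-rational 1 3))

;; The first two tests supply the optional name argument.
(test-equal (lambda () (add n1 n2)) (make-rational 5 6) "rational add")
(test-equal (lambda () (mul n1 n2)) (mul n2 n1) "mul commutes")

;; test-false needs only the one 'observed' argument.
(test-false (lambda () (equ? n1 n2)))
```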
We see that each test requires its first argument to be a procedure, and this is accomplished using a lambda expression with no arguments. (A similar approach was used when measuring execution time in Racket). The first two tests also provide the optional name, which is only displayed if a failure occurs. Note that if errors occur we cannot display the test name, since that isn’t provided as part of the exception data.
The second test shown here highlights the potential for problems when only one ‘expected’ value is allowed. If an error occurs in (mul n2 n1), the program execution will be halted. A possible way around that is to make it similar to the format in the last test, which uses test-false and only requires one argument.
What’s important to test is that the two expressions yield identical results. Neither is really being tested more than the other, so using ‘observed’ and ‘expected’ in this manner is arguably inaccurate. On the other hand, adding a test-true wrapper is like adding extra words to the expression, making it slightly harder to see what’s happening. I prefer the expression as given, since it’s more concise. Feel free to modify it if you disagree.
The first file given below is just the test function framework. The second one contains tests used for the next set of exercises. If you are not using Racket, it should be possible to modify the test-functions file and leave the remaining tests as they are (as long as your version of Scheme can handle exceptions).
In this section we’re building up the components of the previous section into a system with a generic interface. This system will be gradually altered over the course of 2.5, but its core functionality will remain the same. As we’ll see, the interface will largely be unchanged as well. This means that our tests can be re-used, or will only require slight alterations, as we go through this section.
The tests are included in a separate file. In this section, they’re also written using a simple test system that I wrote. You can use it with the exercises just by including it in your library directory (or however you choose to organize your files). If you want a better understanding of it, refer to the previous post.
In Exercise 2.79, the advantage of using the test functions is displayed, by contrasting them with the simpler check functions. When we try to do something that the system can’t handle (in this case, comparing a ‘regular’ Scheme number with a rational number), an error occurs. When using the test functions, the error is still indicated but the program can continue. Those check functions will need to be commented out or deleted, as we cannot make this sort of comparison [yet].
If you take a look at the test file for the arithmetic system, you can see that we’re reaping the benefits of the generic interface by using the same set of tests for multiple types of numbers. There’s a test for basic arithmetic properties that is called in this fashion:
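Roughly like this (the procedure name and the choice of values are illustrative; the real test file’s call will differ in detail):

```scheme
;; Run the shared property tests on three distinct values of one type.
(test-arithmetic-properties (make-rational 1 2)
                            (make-rational 2 3)
                            (make-rational 5 7))
```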
By using arguments of the appropriate type, we’re able to apply these tests to any new number type we add to the system. Even as we change the implementation or add features, this set of tests will remain useful to ensure that the number type still satisfies some simple conditions, and should not need to change. This also means that some of the tests will result in an error when the full system has not been implemented. Luckily, our test framework can handle this and continue past them.
There are also a few tests that may or may not be necessary, as they cover aspects not specifically required of the system. Depending on what you think of as appropriate behavior to define, those tests can be included, altered, or ignored.
Finally, there is a flaw in the complex number implementation as written. It should be easy enough to fix, although as an extra exercise try to write some tests to cover the specific problem exhibited.
See the previous post for the required testing files.
The data-directed derivative operation simplifies the deriv procedure, and allows for a more modular system. Instead of having a conditional statement that must be modified every time a new operation is added, all expressions involving operators are handled by the dispatch table. This makes it easy to add new operators to the table and to modify the existing ones by simply changing the table entry.
Numbers and variables are not handled by the dispatch table because the dispatch operation requires expressions in the form of (operator operands), that is to say a list or pair. Changing this would require changing the expression format, so that every number or variable would require a ‘type’ tag, and make the expressions overly complex. This demonstrates that although the data-directed method does allow for greater flexibility, the interface must still remain consistent, and this does impose some restrictions on what can be used in the table.
For part b, I made a slight change to the accessors for the math operators. Rather than operating on an expression, they operate on a list of operands (which is what the dispatch table uses). As with the original operators, these are restricted to two arguments only. It is conceivable that an existing system might impose the restriction that the accessors cannot be changed and they’d have to remain how they were. I leave how to accommodate this as an extended exercise.
In part c, I extended the system using exponentiation according to the power rule (assuming integer exponents). The resulting derivative procedure looks like this:
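A sketch of the installed procedure (the constructor and accessor names, and the choice of ‘**’ as the table key, follow common conventions for this exercise rather than anything fixed by the text):

```scheme
;; Power rule: d(b^e)/dx = e * b^(e-1) * db/dx, for constant integer e.
;; Per part b, this operates on the operand list, not the whole
;; expression.
(define (deriv-exponentiation operands var)
  (let ((b (base operands))
        (e (exponent operands)))
    (make-product
     (make-product e (make-exponentiation b (make-sum e -1)))
     (deriv b var))))

(put 'deriv '** deriv-exponentiation)
```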
This relies on the constructors and accessors for exponentiation, which can be defined as desired. The symbol for exponentiation is the only thing then required for insertion into the table, and it must be the same as the one used by make-exponentiation in creating expressions. In theory multiple entries in the dispatch table could map to the same operation (e.g. both ‘X’ and ‘*’ could indicate multiplication), but the constructors can only create expressions of one type.
The final part asks what needs to change if the order of arguments to get is altered. Since our table allows for ‘type’ and ‘operator’ to be anything, there will be no problem if the values are stored in the table in that fashion. The only required change is to the installation of the operations (calls to put) so that they match the new ordering.
When I created a rudimentary version of the system in question, I made some assumptions about what information is required, and how it might be presented. You may have interpreted the questions differently, as they are intended to provoke consideration before actually encountering a working system. Multiple systems might well be operating on another set of assumptions entirely. As it is, I’m only able to talk about my approach to the system I created.
The first procedure required is get-record, which will retrieve an employee record from a personnel file. Since the personnel file is said to vary by division, once we have a file we must be able to determine which division it is from in order to process it properly. I made that a hard requirement: the personnel files must have some way that they can be identified, and the means of finding it must be the same for all of them. I call this ‘division-ID’, on the basis that using a unique identifier for each division makes the most sense here as new divisions are added or each one changes its own implementation.
If for some reason the personnel files can’t be changed, one method might be a ‘container file’ that could be created to hold the division ID along with the personnel file. An alternate approach, however, would be to pass a division-specific procedure to get-record along with the file as an argument. Then the files are simply referenced by division in the table. I happen to think this might be a better way than what I went with, but only thought of it while writing up this post. Of course this still requires a unique identifier for the division, but the difference is that the identifier can be determined at the company level, rather than stored in the division’s details (imagine what would happen if a new division is added that wants to keep their division name, but it’s identical to an existing one).
Once the division is known, a dispatch table can be used to find which operation to use. This table stores for each personnel file a procedure that will retrieve a given employee’s record, given their name. The exercise kind of implies that an employee’s name is sufficient to identify them throughout a division or the company, i.e. all employee names are unique. Such is the case in my small example.
The second part of the exercise is somewhat confusingly stated. It was already assumed that each division structures its employee records in a different fashion, so I’m not sure what requirements should be placed on that structure now. Unless that was the point of the question. The way I see it, the employee records have no requirements on them other than that they actually contain information about the salary. All that is needed is that each division has its own procedure that can determine the salary, given a record. I intentionally did not provide one for the Scranton division in my example, but it could be based on what get-record returns; the salary is just the car of that.
The procedure find-employee-record is perhaps the most straightforward, assuming that get-record returns a value of ‘false’ if no employee is found in that file. Then the procedure can just iterate over the list of files:
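A sketch (assuming get-record takes the name and the file, in that order, and that division-files is a plain list):

```scheme
(define (find-employee-record name division-files)
  (if (null? division-files)
      false                             ; not found in any division
      (let ((record (get-record name (car division-files))))
        (if record
            record
            (find-employee-record name (cdr division-files))))))
```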
It’s get-record that makes use of the data-directed methods, allowing the same procedure call to lead to a (possibly) different eventual method of retrieval for each division.
When a new company (or division) is taken over, suitable procedures for record and salary retrieval are required within the division. Their arguments must match the ‘signature’ already used in the dispatch table. Additionally, the division-id information must be somehow added to the new company’s personnel file, or the aforementioned container approach must be used to store it. This is another reason why I would favor the alternative of storing division files in the table and using the id as a table-level identifier; adding a new division is just a matter of adding the division file and its associated procedures to the table under the proper identifier.
Using message-passing for magnitude-angle complex numbers is fairly easy to implement by following the example of the real-imaginary numbers. Almost everything is the same, except that within the internal dispatch procedure, we use the operations using the magnitude r and angle a to calculate results.
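Following the pattern of the book’s make-from-real-imag, the constructor looks like this:

```scheme
;; A magnitude-angle complex number as a message-passing object: the
;; dispatch procedure *is* the number, computing real and imaginary
;; parts on demand from r and a.
(define (make-from-mag-ang r a)
  (define (dispatch op)
    (cond ((eq? op 'magnitude) r)
          ((eq? op 'angle) a)
          ((eq? op 'real-part) (* r (cos a)))
          ((eq? op 'imag-part) (* r (sin a)))
          (else (error "Unknown op -- MAKE-FROM-MAG-ANG" op))))
  dispatch)
```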
Because this number can receive the same messages as a real-imaginary number, we observe that the exact same procedures can be used to access the values. That makes it possible for all the math operations, and all our previous tests, to function exactly as before. Just as with the dispatch table, or with the previous explicit decisions on which implementation to use, the procedures that make use of the numbers are abstracted from knowing what they’re doing underneath. The difference here is that each number is a self-contained item that determines ‘for itself’ how to execute the procedure.
One thing that does change with the message-passing numbers is that the numbers can no longer be displayed directly. For this reason, I added a ‘print’ message that allows for them to be displayed in readable format. This also shows a slight advantage to using the checks in the test framework, as they do not rely on anything being printable in order to determine its correctness.
Looking back at the explicit dispatch approach, we see that even though it allows for an abstract interface, it becomes unwieldy when the system is expanded. Adding a new type, in general, requires changing all procedures that might make use of it so that they’re aware of it. Adding an operation requires adding procedures for each type that uses it, and an additional change to the places that dispatch to it, in order for them to decide which type to use.
The data-directed approach lets us add new types by adding entries to the table for that type, which has no effect on the other types in the table. Adding a new operation is essentially the same – the new operation just needs to be added to the table for each type that can use it.
With message passing, each new type is its own entity. Types implement all their own operations, and adding types is just a matter of writing up the creation routine for that type. If a new operation is required, however, the type creation routine needs to be rewritten for all types that use it.
Looking at just these alternatives, the message-passing approach is the easiest for adding new types. New operations can be added most easily with the data-directed approach. There’s a slight ceding of control once we are no longer making the dispatch explicit, but the flexibility allowed by these methods makes them easier to use.
In the broader context of object-oriented languages, most of the time what we call objects operate in a message-passing fashion (where an object is the entity that determines how to respond to messages). Most languages make the constructors and message-passing interface a bit easier to deal with, and some even allow for dynamic changes to the allowed operations. Data-directed methods aren’t often used at a high level (or if they are, it’s with a limited set). This is often because there is a degree of care required in making sure the centralized table structure works properly, as a problem there affects every element of the system relying on it.
Now that we’ve developed the abstract system through several sections, we come to more formalized methods of abstracting an interface. The approaches used here are data-directed and message-passing. While these terms themselves don’t see broad usage, the concepts are both used (the latter more than the former) in many modern programming languages. The overarching concept is that of polymorphism. The idea is that the interface may connect to a variety of underlying data ‘types’, and what actually occurs during the operation may depend on how that type is implemented.
In order to actually program with the data-directed approach used in the text, we require some sort of table implementation. As the text mentions, the table relies on a concept that won’t be introduced until the next chapter. The library file ‘data-tables.rkt’ handles this. As it incorporates concepts not yet touched on, it’s provided without additional explanation. I will also mention that this marks a point where Racket departs a bit from Scheme, and thus the provided file is in fact Racket-specific. (The difference is not a major one, but is directly related to those not-yet-introduced concepts.) The tables can be used with put and get as defined within the exercise file, which function as the book defines them.
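As a sketch of how the pieces fit together (the installation line assumes helpers like tag and add-complex from the book’s complex package; apply-generic is as the book defines it):

```scheme
;; Install a package entry: the table is keyed by operation name
;; and a list of argument type tags.
(put 'add '(complex complex)
     (lambda (z1 z2) (tag (add-complex z1 z2))))

;; Dispatch looks the entry up by the operands' tags and applies it:
(define (apply-generic op . args)
  (let* ((type-tags (map type-tag args))
         (proc (get op type-tags)))
    (if proc
        (apply proc (map contents args))
        (error "No method for these types" (list op type-tags)))))
```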
As for testing, we can continue to use the same complex number test file, since the interface remains consistent. Only one exercise deals with the complex numbers, and the other exercises are a bit more open-ended. For testing the derivative system, a similar observation can be made as in Section 2.3.2 — it is difficult to get results in a consistent format. The results are checked by displaying them and we rely on manual inspection, although for a few of them specific tests could be written.
Exercise 2.74 is very open-ended and even a little vague. It asks questions about what information might be required of a system that we don’t have an example of. To this end, I’ve created two alternate approaches that can be tested against your methods of retrieval, and some suggestions on a few tests to try. Since this is an exercise about trying to work with an existing codebase, this may require a bit more work to understand, but can be considered a good practical exercise in trying to comprehend someone else’s code.
The subject of this section is the creation of a looser abstract system for complex numbers that can accommodate both implementations previously used. While the addition of tags requires a bit more computation time, it allows for greater flexibility. We’ll be getting a lot of use out of type tagging, as it’s an easy approach to use in Scheme – determining type is just a matter of taking the car of a list, no matter what the remainder of the structure is.
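The tagging procedures from the book show just how little machinery this takes – a tag is consed onto the front, and reading it back is just car:

```scheme
(define (attach-tag type-tag contents)
  (cons type-tag contents))

(define (type-tag datum)
  (if (pair? datum)
      (car datum)
      (error "Bad tagged datum: TYPE-TAG" datum)))

(define (contents datum)
  (if (pair? datum)
      (cdr datum)
      (error "Bad tagged datum: CONTENTS" datum)))
```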
With regard to testing, we find the same thing as noted for the math operations: no changes to them are required. The tests continue to use the same interface as before, and this has no effect on how the results are calculated or compared.
However, something that has changed, compared to the previous section, is that the error that occurred with zero values has apparently gone away. The error hasn’t really been fixed, though. This can be confirmed by changing the test to use make-from-real-imag when setting up the ‘zero’ vector. What’s been introduced is rather subtle with respect to number systems: the zero value created by (make-from-real-imag 0 0) does not behave the same as a zero value created from make-from-mag-angle, even though they satisfy our definition of equality.
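A sketch of the kind of test that exposes the difference – check-complex-= here stands for the custom check from the test file, and which operation actually fails depends on the implementation:

```scheme
;; The two 'zeros' satisfy our definition of equality...
(define zero-ri (make-from-real-imag 0 0))
(define zero-ma (make-from-mag-angle 0 0))
(define z (make-from-real-imag 3 4))

;; ...but running the same arithmetic checks against each
;; does not necessarily produce the same results:
(check-complex-= (add-complex z zero-ri) z)
(check-complex-= (add-complex z zero-ma) z)
```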
A very rational approach is to say that since there are multiple methods of constructing apparently identical values, we need tests that cover them separately, regardless of the underlying implementation. We can create another set of tests to do this, or even a function that generates tests from a template. However, even with just two constructors, the number of tests required rapidly grows unmanageable. While there are many circumstances in which it is critical to get full coverage of all cases, it is not always feasible or practical to do so. The best thing is to write the best tests you can, then revisit and modify them to ensure they aren’t completely missing the important cases.
I don’t want to give the impression that testing isn’t worth it. It will still catch an awful lot of the obvious bugs that can crop up even in simple systems. Unless you really are certain that your tests cover every possibility, they are most likely incomplete and imperfect. And as we just saw, even changes that appear to ‘fix’ bugs can simply hide them because of flaws in the test. Testing is almost never a guarantee of working code; it’s a tool for narrowing down the flaws in it. As more flaws are revealed, tests can be added to help avoid a recurrence of the same problem.
As another demonstration of what testing can reveal, I created an additional complex number implementation. This one is very simple, but it is also very wrong by most standards: its constructors completely ignore the values given to them and store nonsense. Even that does not matter, as the accessors always return the same values. Effectively, there is only a single number in this system.
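A minimal sketch in that spirit (the names and stored values are illustrative, not necessarily the actual code):

```scheme
;; Constructors discard their arguments entirely and store nonsense:
(define (make-from-real-imag x y) (cons 'bogus '()))
(define (make-from-mag-angle r a) (cons 'bogus '()))

;; Accessors ignore their argument and always answer the same thing,
;; so there is effectively only one number in this system:
(define (real-part z) 1)
(define (imag-part z) 1)
(define (magnitude z) (sqrt 2))
(define (angle z) (atan 1 1))
```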
So what happens when we run the tests? The test suite that checks the interface reports failures, as expected. However, at the level of our math operators, all the tests pass just fine! Apparently, this single-number system satisfies whatever properties we were checking for in the calculations. We could add another check to the calculation tests for this system (ensuring that ‘zero’ is not equal to ‘one’). An important thing to remember is that there can be hidden assumptions when dealing with an abstraction, and the tests need to be sure they can find them. It’s not an easy matter.
Another approach to testing this bad implementation might be to alter how we check the computation results. The reason they do not fail is that the ‘expected’ values are all flawed, since they are using the same abstract interface as the system. We could change the tests to look directly at the value returned, or create our own correct answer directly, like so:
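For instance, a check along these lines, assuming the tagged rectangular representation from this section (the test is now welded to that representation):

```scheme
;; Compare against a hand-built representation rather than a value
;; constructed through the same abstract interface:
(check-equal? (add-complex (make-from-real-imag 1 2)
                           (make-from-real-imag 3 4))
              (cons 'rectangular (cons 4 6)))
```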
Then we can’t be fooled by errors in the implementation.
Hopefully it is clear why this is not the best approach to take here. We’ve now tied our tests very closely to one particular implementation. Such tests could not be re-used for the previous section, for instance. Make the change to remove tags, and the tests have to be rewritten.
While there are circumstances where peeking behind the abstraction ends up being necessary, or at least much more feasible, that tendency should be avoided. In the interests of requiring less rewriting, it’s better to adhere to the interface. We’ll be more likely to modify the underlying system if we know we don’t need to expend extra effort to change the tests yet again.
SICP 2.4.1 and Unit Testing
April 26, 2015 11:00
There are no exercises for this section, but that’s no reason not to examine the code. Even with it already written and functioning, it’s an opportune moment to talk more about testing. This section happens to be a great example to use. It has the same interface with different underlying implementations, meaning that the same external tests can be used whatever the implementation is.
Up until now we have often tested for correct answers by defining check functions. For instance, we had check-is-element when we wanted to test if something was an element of a set. These check functions would give us a true or false result, although in a few cases the wrong answer would be returned to help figure out where the error lies (as with check-equal? in 2.3.4). Much of the time these functions ended up looking pretty similar to each other, even if they were short to write.
It should come as no surprise that dedicated frameworks have been written to help us with testing. These not only provide us with pre-written check functions, but also make it easier to run a whole group of tests at once, handle errors more gracefully, and report the details of test failures.
The one I’ll be using and discussing is RackUnit since at the core my environment is Racket. The expressions will mostly look like Scheme, so it should be fairly easy to comprehend. Most testing frameworks share the same capabilities as RackUnit.
In RackUnit, a check is a special function that tests whether some condition is true. If the condition holds, the test passes and nothing happens. But if it does not hold, a failure is signaled which leads to a report of information on the test (including, in many circumstances, what the expected and actual values were). The program can then continue, typically to other tests. One point where the testing framework differs from our old check functions is that it can also handle errors without exiting. In other words, if any error occurs (whether directly related to the test or not), the program can just stop testing that part and continue. This feature makes it possible to run through many tests at once even with very buggy or incomplete code.
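A few checks at the top level illustrate the behavior:

```scheme
(require rackunit)

(check-equal? (+ 1 1) 2)      ; passes: nothing is reported
(check-= 0.3333 1/3 0.001)    ; numeric check with an allowed difference
(check-equal? (* 2 2) 5)      ; fails: a report is printed, and
                              ; evaluation continues past it
```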
Since multiple tests can be run despite errors, tests related to the same feature of the program (this is where the term ‘unit’ comes in) are usually organized so that they can be run as a set. RackUnit has the test-case, a chunk of code that includes one or more checks. These test cases can then be grouped into a test-suite, which is just a list of test cases.
With our tests defined and grouped, we then need some way to run them and report the results. Separating the test definition from the execution allows us to do things like modify the information reported from the tests, or change our procedures and repeat the tests. For our purposes, we’re just using RackUnit’s text UI, which must be loaded separately (refer to the first couple of lines of the test file). This will let us know in the interpreter window how many tests passed and how many failed or had an error, along with the information generated for each test that did not pass. There is also a GUI test runner, which you’re free to experiment with – it can be used without needing to modify the test definitions. Refer to the RackUnit documentation linked above.
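A minimal sketch of defining and running a suite (the suite contents here are illustrative):

```scheme
(require rackunit
         rackunit/text-ui)   ; the text UI lives in its own module

(define sample-tests
  (test-suite "A sample suite"
    (test-case "addition"
      (check-equal? (+ 1 2) 3))
    (test-case "near-equality"
      (check-= (sqrt 2.0) 1.41421 0.001))))

(run-tests sample-tests)  ; reports how many passed, failed, or errored
```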
Finally, we can talk about the actual topic of this section. There are two implementations of the complex number system. They both use the same interface. This means that, as long as we stick to that interface, we can use the exact same tests to see if they are working properly.
The first thing we do is define our own custom check function. We’d like an easy way to compare complex numbers to test the results. One issue we face is that our results often come from calculations with floating-point numbers, which are rarely exact. Normally RackUnit has this part covered, with check-=. It takes the expected value, the actual value, and a delta, which is the largest allowed difference between them. Unfortunately our complex numbers can’t be compared directly with the = operator, so we must write a custom check that will compare them by parts. Custom checks are pretty easy to create for RackUnit; all you really need is a predicate function and a name for the check.
Here’s the code that defines our equality predicate, and the custom check:
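A sketch consistent with the description that follows; the value of complex-delta in particular is arbitrary:

```scheme
(define complex-delta 0.00001)

;; True when both the real and imaginary parts are within
;; complex-delta of each other:
(define (complex-= z1 z2)
  (and (< (abs (- (real-part z1) (real-part z2))) complex-delta)
       (< (abs (- (imag-part z1) (imag-part z2))) complex-delta)))

;; Binary checks name their two arguments, so a failure report
;; can include the actual and expected values:
(define-binary-check (check-complex-= complex-= actual expected))
```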
Here we’ve defined a complex-= predicate that compares the real and imaginary parts separately to see if they are within some predefined range (complex-delta). Then we just create the check with define-binary-check. Making this a binary check lets us add two arguments for the actual and expected values, so that the test report can include that information in the case of a failure. While it won’t exactly be a serious problem, it’s advisable to ensure the order of the actual and expected arguments is correct – different test frameworks adopt different conventions, and I’m always getting them mixed up.
There are two test suites in the test file. The first suite tests the interface to our complex numbers, using the various defined accessors and constructors. The second one tests the actual math operations to ensure that they too are functioning properly. This is in some ways related to ‘unit’ testing, in which each part of the system might be tested separately, and then the components may be tested at a higher level when they work together (this distinction will be easier to see in future sections).
Something that I also did is put several check functions into one test case. This was deliberate, to allow for some experimentation. It’s typically considered good practice to just have one check/test per test case, in order to have a better idea of precisely what the problem is. Consider what happens when various checks within a case fail or pass (experiment by changing some of the expected values to be wrong).
Something we discover in the tests is that some of our calculation tests (involving 0) end up failing for one of the given implementations. It would be a mistake, however, to think that just because the tests reveal a flaw in a particular approach, that approach is the only one with the problem. This issue affects them both, just in different ways. With no other exercises to do, consider this an exercise: see what sort of tests would reveal the problem for Alyssa’s implementation, and how to fix it in both cases.
One last note: The tests are contained within a separate file from the definitions. This is not only so that they can be used in more than one section, but to keep more of the code within the exercise files as pure Scheme. The file containing the tests is placed in a new directory, the ‘library’, which will contain files that will be used in more than one set of exercises. Be sure that this directory structure exists, or alternately change the load command at the start of any files that reference it.