Quirksand

SICP

SICP 3.5.1 Solutions

September 6, 2020 11:26

Ex. 3.50

To fill in the blanks, let’s take a look at the original (single-stream) implementation of stream-map from the text.

(define (stream-map proc s)
  (if (stream-null? s)
      the-empty-stream
      (cons-stream 
        (proc (stream-car s))
        (stream-map proc (stream-cdr s))
      )
    )
  )

The first thing it checks is if the stream is finished, by using stream-null?. Here we have to make that check, but will only look at one of the streams, using (car argstreams). It’s considered an error in Scheme if the lists provided to map are not the same length, so we can follow that example and surmise that it is not necessary for us to check the other streams given, and just use the first one as a measure. So the first blank is filled with the stream-null? check.

The other branch of the if in the original stream-map builds the mapped stream (using cons-stream) to return, by applying proc to the car of the stream, and then recursively mapping the cdr of the stream. We are still producing a single output stream, so cons-stream doesn’t have to change. The only major difference, then, is that we use apply (since argstreams is a list already) and map to get a list of the elements to work on. The map will put stream-car and stream-cdr in essentially the same place they appear in the single-stream version. The end result is our generic stream-map procedure.

(define (stream-map proc . argstreams)
  (if (stream-null? (car argstreams))
      the-empty-stream
      (cons-stream
       (apply proc (map stream-car argstreams))
       (apply stream-map (cons proc (map stream-cdr argstreams)))
       )
      )
  )

Ex. 3.51

It is not hard to understand what the stream x produces. It has the numbers 0 to 10 in sequence, but by using show, every time a position in the stream is checked, it will also be output to the screen using display-line. The tricky part is in determining when the actual display of the value occurs. For that we need to figure out when show ends up being called.

The definition of x uses stream-map, which we just defined in the last exercise. When does stream-map apply its procedure to the stream members? When the input stream is not null, the procedure constructs a new stream whose first element has proc applied to it. However, cons-stream is a special form, not a procedure. So the application of proc doesn’t necessarily happen right away. Internally, cons-stream uses regular cons to create a pair, consisting of its first argument and a promise of the second argument. When the interpreter gets to this call to cons, the first argument passed to cons-stream to be evaluated, and therefore proc will be applied then. This also depends on delay not evaluating its arguments

That means the definition of x will cause the first execution of show on the first element of the enumerated list being mapped. So the first statement defining the stream will cause 0 to be output on a line, before we do anything with the stream itself.

The next statement is (stream-ref x 5), to get the element at index 5 in the stream. Since the result of the stream is just the numbers in sequence, we know the final output will be 5. But of course, show may get called and print out more numbers to the screen before that happens. Within stream-ref, the moment when this happens will be when stream-cdr is called to get the next element. That results in a force of the promise. In the case of x, the force triggers a recursive call to stream-map with the remainder of the input stream (the enumerated interval). As just explained, it is in the cons-stream step inside stream-map when show is actually evaluated.

This means that we do not expect to apply show to the first term (0) when calling stream-ref, but it will be applied to successive terms. How many elements will be output via display-line? Since show only gets applied when stream-cdr is called and the stream-map to the next element in the interval occurs, we need to figure out how many times that happens. The recursive calls to stream-ref will count down from 5. When n in stream-ref gets down to 1, it will call stream-cdr the last time, and the next time, when n is 0, it calls stream-car (which does not evaluate show). There are a total of 5 calls that force the promise in the cdr, and 5 numbers will be displayed.

It may help to consider an internal view of the stream. After being defined, it is a pair consisting of the first element of the stream, and the remainder of the stream, which is something produced by delay. I will use _promise_ here to avoid getting into the internals of delay, but all that matters is the expression attached to it has presumably not been evaluated. Shown below is pseudo-code, showing the values for the elements instead of their variable name.

 ( 0 .  
   _promise_ : 
     (apply stream-map 
            (cons show 
                  (map stream-cdr 
                       ( ( 0 . __promise__: (stream-enumerate-interval (+ 0 1) 10)) )
                       )
                   )
             )
   )

While this is accurate to what we execute, there’s less extraneous information (and no loss of functionality) if we model it using the single-stream form of stream-map, giving us this pair as the internal representation:

 ( 0 .
   _promise_ : 
   (stream-map show 
               (stream-cdr ( 0  . __promise__: (stream-enumerate-interval (+ 0 1) 10)) )
               )
    )

We have a pair that contains in its cdr a promise to do a stream-map on the remainder of the interval. When that evaluates, stream-cdr is called (as an argument to stream-map), and the promise contained within it gets forced, producing the value of 1 for the car of the new stream. This is just the regular ‘enumerate interval’ stream, however, so show is not called. Within stream-map, when it is constructing the stream as a return value, show is called on the cdr and the number is printed to screen. Each time, the call to stream-ref returns a pair like this; when it is called with n=1 (just before the terminal call), this is what s looks like.

 ( 4 . 
   _promise_ : 
   (stream-map show 
               (stream-cdr ( 4 . __promise__: (stream-enumerate-interval (+ 4 1) 10)) )
               )
   )

The next time stream-ref is called, with n=0, the value 5 will be processed by show (and thus displayed) and is now the car of that pair. That result is returned from stream-ref.

We’re just now getting to the interesting part of this exercise, which is what happens next, involving the action of delay. When we call stream-ref again on x and this time with a value of 7, we might expect that it will also output the numbers 1 through 7 before returning the result of 7. However, what we see is Assuming delay is memoized just two lines of output: 6, then 7, and the final result.

Once we have evaluated the cdr the first time, the stream resembles something like the structure shown below. Although technically the code calling the promise is the same, I’ve marked that it will return an already-computed result, since it will do that instead of evaluating the expression. This return value comes from the call to stream-map that produced the value with 1 at the head. Note that in the case of the exercise, all the sucessive promises that are already evaluated would have this ‘return’ form as well. This shows what it looks like only after the first promise is forced:

 ( 0 .
   _promise_ : 
    *return* (1 . _promise_ : 
         (stream-map proc (stream-cdr ( 1 . _promise_ : (stream-enumerate-interval (+ 1 1) 10)))
         )
  )

What happens is that the memoization of delay prevents values that have already been computed from being displayed again. The mapping that executes show only happens the very first time that element of the stream is accessed. After that, delay is storing the value returned by stream-map, and no longer calls show. Instead it just gives the result. With different definitions of delay we could see different output, although that is more the focus of the next exercise.

As an additional note on this exercise, the footnote in the text mentions that this style of programming is “odious”, as it is difficult to analyze properly and leads to confusing results. The typical term used for mixing something like display and assignment is a “side effect”, and you may hear a warning against having such in your code. The basic idea is that you want predictable and expected behavior to occur when calling a procedure, and not to perform some additional action unrelated to how it was called, or to have it modify something else in ways unclear or unknown to the caller of that procedure. It also gets at the heart of what a particular data structure is. Is the value of a member of this stream x simply “a number” or is it “a number and the display of that number”? If we want to try and define it as the latter, it is not only a bit perplexing to put into words but becomes impractical to implement. A better design in this case would be to consider the stream just as a stream of numbers, and then display it as a separate action if that’s what we want. We should not have the stream itself try to do it for us, even if that might seem more convenient at first.

Ex. 3.52

In this exercise, we’re now looking at setting external variables as a “side effect”. In this stream, the variable sum is an external one, and as we’ll see, setting it in the course of evaluating the terms of the stream produces confusing results. I’m assuming for now that delay is memoized, as the text does initially. We’ll step through a few lines at a time.

(define sum 0)            
(define (accum x)
  (set! sum (+ x sum))
  sum)

So far, we’re fine. sum is still 0, since we haven’t called accum with any values yet.

(define seq (stream-map accum (stream-enumerate-interval 1 20)))

As explained in the previous exercise, the creation of a stream with stream-map causes the first term to be evaluated. That means accum is called with a value of 1, and sum is set to 1.

(define y (stream-filter even? seq))

Now we’re defining a variable using stream-filter. Checking its definition, we can see that it either continues to call stream-cdr (forcing the contents of the stream) or it will use cons-stream to build its result. This filter is testing for even numbers. It is first called with 1 (the stream-car of seq) which leads to the false branch, and the stream-cdr of seq is forced. That causes accum to be run with the next value from the interval (2), and the value computed for the stream seq is 1+2=3, which again is not even, so stream-filter continues. The next time the cdr is forced, accum is called with the value of 3, and the result is 6, which is even, so we create a new stream using that result and the remainder. Note that stream-filter does not trigger an evaluation of accum when creating its new stream, since the first element of seq was already evaluated and stored in the stream, and calling stream-car does not force anything.

(define z (stream-filter (lambda (x) (= (remainder x 5) 0)) seq))

The stream z is similar to y, except we’re filtering based on divisibility by 5 instead of 2. We already know that stream-filter triggers accum when it forces the remainder of the stream. However, since we’re also memoizing the delayed terms, all the values of seq that were previously computed do not cause any additional calls to accum. The first three terms of seq are checked for divisibility by 5, and fail (as they are 1,3,6). A previously-unforced promise needs to be evaluated, so accum is called with the value 4. That results in sum being set to 10, and since that is divisible by 5, we do not evaluate any more terms when constructing stream z.

(stream-ref y 7)

This time, we will find the term in y at index 7. Since y consists of even terms, that will be the 8th even number in the generated sequence. So far we have evaluated seq four times, and the memoized values are 1,3,6,10. That’s two even numbers there already. Thanks to the memoization in delay, each newly evaluated term from the original interval is only added to sum once, so the created stream seq will be the sequence where each term is the previous term plus the index of the sequence seq is the Triangular Numbers. The next even terms that follow are 28, 36, 66, 78, 120, and finally 136, the eighth even member of the sequence. All elements past the 10 will trigger accum, and at the end sum will be 136.

(display-stream z)

Finally, we display the entirety of the stream z. Since the original interval was only 20 terms, this stream will be the first twenty Triangular Numbers, filtered to those that are divisible by 5. That will be the sequence 10,15,45,55,105,120,190, and 210. Since 210 is also the twentieth term in the unfiltered sequence, that is what sum will be when complete.

If more terms had been included that were not divisible by 5, the filter would have exhausted the sequence without displaying any more values for z, and sum could have ended up higher. For example, if the enumerated sequence went to 22, sum would be 253 at the end even though the displayed output would not be any different.

As suggested by the text, all of this depends on delay being memoized. If that isn’t the case, then every new access to the cdr of seq will cause sum to change, and that will affect the values of seq. In fact, since the evaluation occurs each time, the elements of seq will be different every time the remainder is accessed! If, for instance, we repeated the (stream-ref y 7) call, we would get a new result each time.

Just to compare, we can redefine delay to not use memoization (see the streambasic file for an implementation) and run this set of steps again. Up through the point that y is defined, there is no difference between the two. That’s because at that point the remainder of seq has not been forced with any repeated values. After that, when z is defined, the results start to diverge. Every time the sequence seq is processed, it produces a sequence like the Triangular Numbers, but with the values (except for the very first) offset by whatever sum currently happens to be. (The first element doesn’t change because the car of a stream does not rely on delay.) So, when z is created, the unfiltered values of seq will be 1,(6+2)=8,(8+3)=11,(11+4)=15. However, seq will change every time, so examining z again will yield different results, and those will also be different depending on what element of y was most recently evaluated. In short, it becomes impossible to predict what any value of seq will be without knowing the entire history of the calls to it, and that is precisely why we would not want side effects that modify a variable without it being explicit.

I slightly sidestepped the actual question in the text here. It’s not compeletely clear what they are intending to ask. As stated, the question is what would happen if delay had been implemented with a lambda expression “without the optimization provided by memo-proc”. However, if we did use a literal lambda expression, we no longer have a special form. In that implementation, there would technically be no difference to the output from the memoized version. That would result only because all the values in seq would be evaluated at the moment of creation (stream-map would recursively create a list instead of delaying the computation of its remainder), and thus each calculation would only occur once, as in the memoized version. That may be what’s intended here, but the phrasing seems to be asking what would happen if memoization were not occurring. That would give us the utterly confusing behavior described in the previous paragraph. It seems like an oversight that the text does not explain special forms in more detail, as it appears to be a crucial element of this exercise.

Solutions 3.5.1