Posted by ProCOM
on October 13, 2007 – 12:21 pm
In this final part of a three-part series on code blocks and iteration, you’ll learn how to stop an iteration, how to hide setup and cleanup, and more. This article is excerpted from chapter seven of the Ruby Cookbook, written by Lucas Carlson and Leonard Richardson (O’Reilly, 2006; ISBN: 0596523696). Copyright © 2006 O’Reilly Media, Inc. All rights reserved. Used with permission from the publisher. Available from booksellers or direct from O’Reilly Media.
7.8 Stopping an Iteration
Problem
You want to interrupt an iteration from within the code block you passed into it.
Solution
The simplest way to interrupt execution is to use break. A break statement will jump out of the closest enclosing loop defined in the current method:
1.upto(10) do |x|
  puts x
  break if x == 3
end
# 1
# 2
# 3
Discussion
The break statement is simple, but it has several limitations. You can’t use break within a code block defined with Proc.new or (in Ruby 1.9 and up) Kernel#proc. If this is a problem for you, use lambda instead. Here is what happens with Proc.new:
aBlock = Proc.new do |x|
  puts x
  break if x == 3
  puts x + 2
end
aBlock.call(5)
# 5
# 7
aBlock.call(3)
# 3
# LocalJumpError: break from proc-closure
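For comparison, here is a minimal sketch of a similar block written with lambda. Inside a lambda, break ends only the current call and hands its argument back as the return value, so no LocalJumpError is raised. The name checker is made up for this illustration, and the block returns values instead of printing them.

```ruby
# Hypothetical example: a lambda that stops early when it sees 3.
checker = lambda do |x|
  break :stopped if x == 3   # in a lambda, break ends just this call
  x + 2
end

checker.call(5)   # => 7
checker.call(3)   # => :stopped
```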
More seriously, you can’t use break to jump out of multiple loops at once. Once a loop has run, there’s no way to know whether it completed normally or by using break.
The simplest way around this problem is to enclose the code you want to skip within a catch block with a descriptive symbolic name. You can then throw the corresponding symbol when you want to jump to the end of the catch block. This lets you skip out of any number of nested loops and method calls.
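As a small sketch of that idea, the hypothetical helper below (find_position is not part of the recipe) searches a nested array and leaves both loops at once with throw. The second argument to throw becomes the return value of the catch block.

```ruby
# Hypothetical helper: returns [row, column] of the first match, or nil.
def find_position(grid, target)
  catch(:found) do
    grid.each_with_index do |row, i|
      row.each_with_index do |value, j|
        # Jumps straight to the end of the catch block, out of both loops.
        throw :found, [i, j] if value == target
      end
    end
    nil   # reached only if nothing was thrown
  end
end

grid = [[1, 2, 3], [4, 5, 6]]
find_position(grid, 5)   # => [1, 1]
find_position(grid, 9)   # => nil
```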
The throw/catch syntax isn’t exception handling; exceptions use a raise/rescue syntax. This is a special flow control construct designed to replace the use of exceptions for flow control (as sometimes happens in Java programs). It’s a bit like an old-style global GOTO, capable of suddenly moving execution to a faraway part of your program. It keeps your code more readable than a GOTO, though, because it’s restricted: a throw can only jump to the end of a corresponding catch block.
The best example of the catch..throw syntax is the Find.find function described in Recipe 6.12. When you pass a code block into Find.find, it yields up every directory and file in a certain directory tree. When your code block is given a directory, it can stop find from recursing into that directory by calling Find.prune, which throws a :prune symbol. Using break would stop the find operation altogether; throwing a symbol lets Find.find know to just skip one directory.
Here’s a simplified view of the Find.find and Find.prune code:
def find(*paths)
  paths.each do |p|
    catch(:prune) do
      # Process p as a file or directory…
    end
    # When you call Find.prune you’ll end up here.
  end
end
def prune
  throw :prune
end
When you call Find.prune, execution jumps to immediately after the catch(:prune) block. Find.find then starts processing the next file or directory.
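To see the effect from the caller's side, here is a hedged sketch that builds a throwaway directory tree and prunes one branch of it. The directory names keep and skip are invented for the example; Find.find, Find.prune, Dir.mktmpdir and FileUtils come from the standard library.

```ruby
require 'find'
require 'tmpdir'
require 'fileutils'

visited = []
Dir.mktmpdir do |root|
  # Build a small temporary tree: root/keep/a.txt and root/skip/b.txt
  FileUtils.mkdir_p(File.join(root, 'keep'))
  FileUtils.mkdir_p(File.join(root, 'skip'))
  FileUtils.touch(File.join(root, 'keep', 'a.txt'))
  FileUtils.touch(File.join(root, 'skip', 'b.txt'))

  Find.find(root) do |path|
    # Find.prune throws :prune, so the line below never runs for 'skip',
    # and Find.find does not descend into that directory.
    Find.prune if File.directory?(path) && File.basename(path) == 'skip'
    visited << path.sub(root, '')   # record paths relative to the root
  end
end

visited.sort   # => ["", "/keep", "/keep/a.txt"]
```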
See Also
7.9 Looping Through Multiple Iterables in Parallel
Problem
You want to traverse multiple iteration methods simultaneously, probably to match up the corresponding elements in several different arrays.
Solution
The SyncEnumerator class, defined in the generator library, makes it easy to iterate over a bunch of arrays or other Enumerable objects in parallel. Its each method yields a series of arrays, each array containing one item from each underlying Enumerable object:
require 'generator'
enumerator = SyncEnumerator.new(%w{Four seven}, %w{score years},
                                %w{and ago})
enumerator.each do |row|
  row.each { |word| puts word }
  puts '---'
end
# Four
# score
# and
# ---
# seven
# years
# ago
# ---
enumerator = SyncEnumerator.new(%w{Four and}, %w{score seven years ago})
enumerator.each do |row|
  row.each { |word| puts word }
  puts '---'
end
# Four
# score
# ---
# and
# seven
# ---
# nil
# years
# ---
# nil
# ago
# ---
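When the sources are plain arrays, a comparable pairing is available through Array#zip, without the generator library. This is an alternative to SyncEnumerator, not the recipe's own method, and it treats uneven lengths differently: the result always has as many rows as the receiver. The variable names are only for illustration.

```ruby
rows = %w{Four seven}.zip(%w{score years}, %w{and ago})
# => [["Four", "score", "and"], ["seven", "years", "ago"]]

# With uneven lengths, zip follows the receiver: it has two elements,
# so only two rows come back and "years" and "ago" are dropped.
short = %w{Four and}.zip(%w{score seven years ago})
# => [["Four", "score"], ["and", "seven"]]

# A shorter argument is padded with nil instead.
padded = %w{score seven years ago}.zip(%w{Four and})
# => [["score", "Four"], ["seven", "and"], ["years", nil], ["ago", nil]]
```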
You can reproduce the workings of a SyncEnumerator by wrapping each of your Enumerable objects in a Generator object. This code acts like SyncEnumerator#each, only it yields each individual item instead of arrays containing one item from each Enumerable:
def interosculate(*enumerables)
  generators = enumerables.collect { |x| Generator.new(x) }
  done = false
  until done
    done = true
    generators.each do |g|
      if g.next?
        yield g.next
        done = false
      end
    end
  end
end
interosculate(%w{Four and}, %w{score seven years ago}) do |x|
puts x
end
# Four
# score
# and
# seven
# years
# ago
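On Ruby versions that provide Enumerator#next (1.9 and later, as far as I know), the same round-robin behavior can be sketched without the Generator class. This is an assumption-laden variant, not the book's code: the method name interleave is invented here, and exhaustion is detected by rescuing StopIteration rather than by calling next?.

```ruby
# Hypothetical variant of interosculate built on Enumerator#next.
def interleave(*enumerables)
  enumerators = enumerables.map(&:to_enum)
  until enumerators.empty?
    # Take one item from each source per round; drop sources that are done.
    enumerators = enumerators.select do |e|
      begin
        item = e.next
      rescue StopIteration
        next false   # this source is exhausted
      end
      yield item
      true
    end
  end
end

words = []
interleave(%w{Four and}, %w{score seven years ago}) { |x| words << x }
words.join(' ')   # => "Four score and seven years ago"
```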
Discussion
Any object that implements the each method can be wrapped in a Generator object. If you’ve used Java, think of a Generator as being like a Java Iterator object. It keeps track of where you are in a particular iteration over a data structure.
Normally, when you pass a block into an iterator method like each, that block gets called for every element in the iterator without interruption. No code outside the block will run until the iterator is done iterating. You can stop the iteration by writing a break statement inside the code block, but you can’t restart a broken iteration later from the same place, unless you use a Generator.
Think of an iterator method like each as a candy dispenser that pours out all its candy in a steady stream once you push the button. The Generator class lets you turn that candy dispenser into one which dispenses only one piece of candy every time you push its button. You can carry this new dispenser around and ration your candy more easily.
In Ruby 1.8, the Generator class uses continuations to achieve this trick. It sets bookmarks for jumping out of an iteration and then back in. When you call Generator#next, the generator “pumps” the iterator once (yielding a single element), sets a bookmark, and returns control back to your code. The next time you call Generator#next, the generator jumps back to its previously set bookmark and “pumps” the iterator once more.
Ruby 1.9 uses a more efficient implementation based on threads. This implementation calls each Enumerable object’s each method (triggering the never-ending stream of candy), but it does it in a separate thread for each object. After each piece of candy comes out, Ruby freezes time (pauses the thread) until the next time you call Generator#next.
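The one-piece-at-a-time behavior can also be tried with the built-in Enumerator class, which offers a next method of its own. The sketch below assumes a Ruby that has Enumerator#next (1.9 or later, to my knowledge) and does not use the generator library; the variable name dispenser just echoes the candy analogy.

```ruby
# Calling each without a block returns an Enumerator over the array.
dispenser = %w{red green blue}.each

dispenser.next   # => "red"
dispenser.next   # => "green"
# Other code can run here; the enumerator remembers its position.
dispenser.next   # => "blue"

finished = false
begin
  dispenser.next
rescue StopIteration
  # Raised once the underlying each has nothing left to yield.
  finished = true
end
```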
It’s simple to wrap an array in a generator, but if that’s all there were to generators, you wouldn’t need to mess around with Generators or even SyncEnumerators. It’s easy to simulate the behavior of SyncEnumerator for arrays by starting an index into each array and incrementing it whenever you want to get another item from a particular array. Generator methods are truly useful in their ability to turn any type of iteration into a single-item candy dispenser.
Suppose that you want to use the functionality of a generator to iterate over an array, but you have an unusual type of iteration in mind. For instance, consider an array that looks like this:
l = ["junk1", 1, "junk2", 2, "junk3", "junk4", 3, "junk5"]
Let’s say you’d like to iterate over the list but skip the “junk” entries. Wrapping the list in a generator object doesn’t work; it gives you all the entries:
g = Generator.new(l)
g.next # => "junk1"
g.next # => 1
g.next # => "junk2"
It’s not difficult to write an iterator method that skips the junk. Now, we don’t want an iterator method (we want a Generator object), but the iterator method is a good starting point. At least it proves that the iteration we want can be implemented in Ruby.
def l.my_iterator
  each { |e| yield e unless e =~ /^junk/ }
end
l.my_iterator { |x| puts x }
# 1
# 2
# 3
Here’s the twist: when you wrap an array in a Generator or a SyncEnumerator object, you’re actually wrapping the array’s each method. The Generator doesn’t just happen to yield elements in the same order as each: it’s actually calling each, but using continuation (or thread) trickery to pause the iteration after each call to Generator#next.
By defining an appropriate code block and passing it into the Generator constructor, you can make a generator object out of any piece of iteration code, not only the each method. The generator will know to call and interrupt that block of code, just as it knows to call and interrupt each when you pass an array into the constructor. Here’s a generator that iterates over our array the way we want:
g = Generator.new { |g| l.each { |e| g.yield e unless e =~ /^junk/ }}
g.next # => 1
g.next # => 2
g.next # => 3
The Generator constructor can take a code block that accepts the generator object itself as an argument. This code block performs the iteration that you’d like to have wrapped in a generator. Note the basic similarity of the code block to the body of the l.my_iterator method. The only difference is that instead of the yield keyword we call the Generator#yield function, which handles some of the work involved with setting up and jumping to the continuations (Generator#next handles the rest of the continuation work).
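The built-in Enumerator class accepts a block in much the same way, handing it a yielder object in place of the generator itself. The following is a sketch under the assumption that Enumerator.new with a block is available (Ruby 1.9 and later, as far as I know). It adds a to_s call so that the pattern match is also safe for the integers on Ruby versions where integers have no =~ method.

```ruby
l = ["junk1", 1, "junk2", 2, "junk3", "junk4", 3, "junk5"]

g = Enumerator.new do |yielder|
  # yielder << e plays the role that g.yield e plays with Generator.
  l.each { |e| yielder << e unless e.to_s =~ /^junk/ }
end

g.next   # => 1
g.next   # => 2
g.next   # => 3
```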
Once you see how this works, you can eliminate some duplicate code by wrapping the l.my_iterator method itself in a Generator:
g = Generator.new { |g| l.my_iterator { |e| g.yield e } }
g.next # => 1
g.next # => 2
g.next # => 3
Here’s a version of the interosculate method that can wrap methods as well as arrays. It accepts any combination of Enumerable objects and Method objects, turns each one into a Generator object, and loops through all the Generator objects, getting one element at a time from each:
def interosculate(*iteratables)
  generators = iteratables.collect do |x|
    if x.is_a? Method
      Generator.new { |g| x.call { |e| g.yield e } }
    else
      Generator.new(x)
    end
  end
  done = false
  until done
    done = true
    generators.each do |g|
      if g.next?
        yield g.next
        done = false
      end
    end
  end
end
Here, we pass interosculate an array and a Method object, so that we can iterate through two arrays in opposite directions:
words1 = %w{Four and years}
words2 = %w{ago seven score}
interosculate(words1, words2.method(:reverse_each)) { |x| puts x }
# Four
# score
# and
# seven
# years
# ago
See Also
7.10 Hiding Setup and Cleanup in a Block Method
Problem
You have a setup method that always needs to run before custom code, or a cleanup method that needs to run afterwards. You don’t trust the person writing the code (possibly yourself) to remember to call the setup and cleanup methods.
Solution
Create a method that runs the setup code, yields to a code block (which contains the custom code), then runs the cleanup code. To make sure the cleanup code always runs, even if the custom code raises an exception, use a begin/ensure block.
def between_setup_and_cleanup
  setup
  begin
    yield
  ensure
    cleanup
  end
end
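To check that the cleanup step runs even when the custom code fails, here is a small sketch that records each step in an array. Ruby's keyword for cleanup that must always run is ensure. The method name with_logged_steps and the log array are invented for this illustration.

```ruby
# Hypothetical stand-in for a setup/cleanup pair: each step is logged.
def with_logged_steps(log)
  log << :setup
  begin
    yield
  ensure
    log << :cleanup   # runs whether or not the block raised
  end
end

log = []
begin
  with_logged_steps(log) do
    log << :work
    raise 'boom'
  end
rescue RuntimeError
  log << :rescued     # the exception still propagates to the caller
end

log   # => [:setup, :work, :cleanup, :rescued]
```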
Here’s a concrete example. It adds a DOCTYPE and an HTML tag to the beginning of an HTML document. At the end, it closes the HTML tag it opened earlier. This saves you a little bit of work when you’re generating HTML files.
def write_html(out, doctype=nil)
  doctype ||= %{<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">}
  out.puts doctype
  out.puts '<html>'
  begin
    yield out
  ensure
    out.puts '</html>'
  end
end
write_html($stdout) do |out|
  out.puts '<h1>Sorry, the Web is closed.</h1>'
end
# <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
#     "http://www.w3.org/TR/html4/loose.dtd">
# <html>
# <h1>Sorry, the Web is closed.</h1>
# </html>
Discussion
This useful technique shows up most often when there are scarce resources (such as file handles or database connections) that must be closed when you’re done with them, lest they all get used up. A language that makes the programmer remember these resources tends to leak those resources, because programmers are lazy. Ruby makes it easy to be lazy and still do the right thing.
You’ve probably used this technique already, with the Kernel#open and File.open methods for opening files on disk. These methods accept a code block that manipulates an already open file. They open the file, call your code block, and close the file once you’re done:
open('output.txt', 'w') do |out|
  out.puts 'Sorry, the filesystem is also closed.'
end
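A quick way to confirm that the file really is closed afterwards is to keep a reference to the handle and ask it. This sketch writes into a temporary directory (via Dir.mktmpdir from the standard library) so that it leaves nothing behind; the variable names are only for illustration.

```ruby
require 'tmpdir'

handle = nil
contents = nil
Dir.mktmpdir do |dir|
  path = File.join(dir, 'output.txt')
  File.open(path, 'w') do |out|
    handle = out   # keep a reference to inspect after the block
    out.puts 'Sorry, the filesystem is also closed.'
  end
  contents = File.read(path)
end

handle.closed?   # => true
contents         # => "Sorry, the filesystem is also closed.\n"
```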
Ruby’s standard cgi module takes the write_html example to its logical conclusion.* You can construct an entire HTML document by nesting blocks inside each other. Here’s a small Ruby CGI that outputs much the same document as the write_html example above.
#!/usr/bin/ruby
# closed_cgi.rb
require 'cgi'
c = CGI.new("html4")
c.out do
  c.html do
    c.h1 { 'Sorry, the Web is closed.' }
  end
end
Note the multiple levels of blocks: the block passed into CGI#out simply calls CGI#html to generate the DOCTYPE and the <html> tags. The <html> tags contain the result of a call to CGI#h1, which encloses some plain text in <h1> tags. The program produces this output:
Content-Type: text/html
Content-Length: 137
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<HTML><h1>Sorry, the Web is closed. </H1></HTML>
The XmlMarkup class in Ruby’s builder gem works the same way: you can write Ruby code that resembles the structure of the document it creates:
require 'rubygems'
require 'builder'
xml = Builder::XmlMarkup.new.message('type' => 'apology') do |b|
  b.content('Sorry, Web Services are closed.')
end
puts xml
# <message type="apology">
#   <content>Sorry, Web Services are closed.</content>
# </message>
See Also
7.11 Coupling Systems Loosely with Callbacks
Problem
You want to combine different types of objects without hardcoding them full of references to each other.
Solution
Use a callback system, in which objects register code blocks with each other to be executed as needed. An object can call out to its registered callbacks when it needs something, or it can send notification to the callbacks when it does something.
To implement a callback system, write a “register” or “subscribe” method that accepts a code block. Store the registered code blocks as Proc objects in a data structure: probably an array (if you only have one type of callback) or a hash (if you have multiple types). When you need to call the callbacks, iterate over the data structure and call each of the registered code blocks.
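For the simplest case, a single type of callback kept in an array, the idea can be sketched as follows. The Timer class and its on_tick and tick methods are hypothetical names chosen for this illustration, not part of the recipe.

```ruby
# Hypothetical class with a single kind of callback, stored in an array.
class Timer
  def initialize
    @on_tick = []
  end

  # Register a block; &callback converts it to a Proc object.
  def on_tick(&callback)
    @on_tick << callback
    self
  end

  # Call every registered block in order, passing along the tick number.
  def tick(n)
    @on_tick.each { |callback| callback.call(n) }
    nil
  end
end

seen = []
timer = Timer.new
timer.on_tick { |n| seen << "first saw #{n}" }
timer.on_tick { |n| seen << "second saw #{n}" }
timer.tick(1)
seen   # => ["first saw 1", "second saw 1"]
```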
Here’s a mixin module that gives each instance of a class its own hash of “listener” callback blocks. An outside object can listen for a particular event by calling subscribe with the name of the event and a code block. The dispatcher itself is responsible for calling notify with an appropriate event name at the appropriate time, and the outside object is responsible for passing in the name of the event it wants to “listen” for.
module EventDispatcher
  def setup_listeners
    @event_dispatcher_listeners = {}
  end
  def subscribe(event, &callback)
    (@event_dispatcher_listeners[event] ||= []) << callback
  end
  protected
  def notify(event, *args)
    if @event_dispatcher_listeners[event]
      @event_dispatcher_listeners[event].each do |m|
        m.call(*args) if m.respond_to? :call
      end
    end
    return nil
  end
end
Here’s a Factory class that keeps a set of listeners. An outside object can choose to be notified every time a Factory object is created, or every time a Factory object produces a widget:
class Factory
  include EventDispatcher
  def initialize
    setup_listeners
  end
  def produce_widget(color)
    # Widget creation code goes here…
    notify(:new_widget, color)
  end
end
Here’s a listener class that’s interested in what happens with Factory objects:
class WidgetCounter
  def initialize(factory)
    @counts = Hash.new(0)
    factory.subscribe(:new_widget) do |color|
      @counts[color] += 1
      puts "#{@counts[color]} #{color} widget(s) created since I started watching."
    end
  end
end
Finally, here’s the listener in action:
f1 = Factory.new
WidgetCounter.new(f1)
f1.produce_widget("red")
# 1 red widget(s) created since I started watching.
f1.produce_widget("green")
# 1 green widget(s) created since I started watching.
f1.produce_widget("red")
# 2 red widget(s) created since I started watching.
# This won’t produce any output, since our listener is listening to
# another Factory.
Factory.new.produce_widget("blue")
Discussion
Callbacks are an essential technique for making your code extensible. This technique has many names (callbacks, hook methods, plugins, publish/subscribe, etc.) but no matter what terminology is used, it’s always the same. One object asks another to call a piece of code (the callback) when some condition is met. This technique works even when the two objects know almost nothing about each other. This makes it ideal for refactoring big, tightly integrated systems into smaller, loosely coupled systems.
In a pure listener system (like the one given in the Solution), the callbacks set up lines of communication that always move from the event dispatcher to the listeners. This is useful when you have a master object (like the Factory), from which numerous lackey objects (like the WidgetCounter) take all their cues.
But in many loosely coupled systems, information moves both ways: the dispatcher calls the callbacks and then uses the return results. Consider the stereotypical web portal: a customizable homepage full of HTML boxes containing sports scores, weather predictions, and so on. Since new boxes are always being added to the system, the core portal software shouldn’t have to know anything about a specific box. The boxes should also know as little about the core software as possible, so that changing the core doesn’t require a change to all the boxes.
A simple change to the EventDispatcher module makes it possible for the dispatcher to use the return values of the registered callbacks. The original implementation of EventDispatcher#notify called the registered code blocks, but ignored their return value. This version of EventDispatcher#notify yields the return values to a block passed in to notify:
module EventDispatcher
  def notify(event, *args)
    if @event_dispatcher_listeners[event]
      @event_dispatcher_listeners[event].each do |m|
        yield(m.call(*args)) if m.respond_to? :call
      end
    end
    return nil
  end
end
Here’s an insultingly simple portal rendering engine. It lets boxes register to be rendered inside an HTML table, on one of two rows on the portal page:
class Portal
  include EventDispatcher
  def initialize
    setup_listeners
  end
  def render
    puts '<table>'
    render_block = Proc.new { |box| puts "  <td>#{box}</td>" }
    [:row1, :row2].each do |row|
      puts ' <tr>'
      notify(row, &render_block)
      puts ' </tr>'
    end
    puts '</table>'
  end
end
Here’s the rendering engine rendering a specific user’s portal layout. This user likes to see a stock ticker and a weather report in the first row, and a news box in the second. Note that there aren’t even any classes for these boxes; they’re so simple they can be implemented as anonymous code blocks:
portal = Portal.new
portal.subscribe(:row1) { 'Stock Ticker' }
portal.subscribe(:row1) { 'Weather' }
portal.subscribe(:row2) { 'Pointless, Trivial News' }
portal.render
# <table>
#  <tr>
#   <td>Stock Ticker</td>
#   <td>Weather</td>
#  </tr>
#  <tr>
#   <td>Pointless, Trivial News</td>
#  </tr>
# </table>
If you want the registered listeners to be shared across all instances of a class, you can make listeners a class variable, and make subscribe a module method. This is most useful when you want listeners to be notified whenever a new instance of the class is created.
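One way this could look is sketched below. It is an illustration under stated assumptions rather than the recipe's own code: the SharedFactory class is hypothetical, and for brevity the listener hash and the subscribe method live directly in the class instead of in a mixin module.

```ruby
# Hypothetical class whose listeners are shared by every instance.
class SharedFactory
  @@listeners = {}

  # Class-level subscribe: callers do not need an instance to register.
  def self.subscribe(event, &callback)
    (@@listeners[event] ||= []) << callback
  end

  def initialize
    notify(:new_factory, self)   # announce every new instance
  end

  def produce_widget(color)
    notify(:new_widget, color)
  end

  private

  def notify(event, *args)
    (@@listeners[event] || []).each { |callback| callback.call(*args) }
    nil
  end
end

created = []
SharedFactory.subscribe(:new_factory) { |factory| created << factory }
a = SharedFactory.new
b = SharedFactory.new
created == [a, b]   # => true
```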
———————————————-
* But your code will be more maintainable if you do HTML with templates instead of writing it in Ruby code.
—
by O’Reilly Media