The three rules of Ruby Quiz:
1. Please do not post any solutions or spoiler discussion for this quiz until 48 hours have passed from the time on this message. 2. Support Ruby Quiz by submitting ideas as often as you can: http://www.rubyquiz.com/ 3. Enjoy! Suggestion: A [QUIZ] in the subject of emails about the problem helps everyone on Ruby Talk follow the discussion. Please reply to the original quiz message, if you can. -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= by Mauricio Fernandez "Secret Agent 00111 is back at the Casino again, playing a game of chance, while the fate of mankind hangs in the balance. Each game consists of a series of favorable events (probability p), terminated by the first occurrence of an unfavorable event (probability q = 1 - p). More specifically, the game is roulette, and the unfavorable event is the occurrence of 0, which has a probability of q = 1 / 37. No one seriously doubts that 00111 will come through again, but the Secret Service is quite concerned about communicating the blow-by-blow description to Whitehall. The bartender, who is a free-lance agent, has a binary channel available, but he charges a stiff fee for each bit sent. The problem perplexing the Service is how to encode the vicissitudes of the wheel[1] so as to place the least strain on the Royal Exchequer." 00111 will be needing the communication device in 48H. Q has decided to give him a programmable gizmo, with a built-in Ruby interpreter, in the hope that 00111 will code/play with it on his own and do his best to return it in the original condition, very unlike all the other gadgets he's been given before. Therefore, Q can enjoy the luxury of coding in Ruby, and he's confident he will be able to complete the task in time. Q's first thought of using Zlib::Deflate to compress the data as shown in the following example: # EVENTS will be a large sequence of boolean events: # true favorable event (00111 wins) # false unfavorable event (00111 loses) # we generate a sample one as follows: EVENTS = (0..1000).map{ rand() > 1.0 / 37 } # compress require 'zlib' string = EVENTS.map{|x| x ? "0" : "1"}.join("") string.size # => 1001 compressed = Zlib::Deflate.deflate(string) compressed.size # => 69 # now 'compressed' is transmitted by the bartender However, Q has been wary of Zlib since he ran into some sort of BufferError when he was installing Mongrel using RubyGems, and he still mourns the sad demise of 00110, which was the direct result of a binary incompatibility between a Ruby build and a third-party extension. Moreover, the MI6 has been spying on the management of the Casino, which faced a similar problem and solved it long ago, and intercepted the following message at the time the Casino's system was being developed: TEST RUN 43 ----------- Original size: 10000 Compressed to: 242 bytes Result using deflate: 462 bytes Q keeps a very detailed log of his activities, so you know he has been reading up on Information Theory, and has found a paper that addresses the very problem he needs to solve: Run-length encodings--S. W. Golomb (1966); IEEE Trans Info Theory 12(3):399[1] He also found some information in the Wikipedia[2], and wrote the following unit tests before feeling indisposed while reading RailsBlob[3]: http://rubyquiz.com/test_00111_gizmo.rb Q has also wrote some code to exert the SecretAgent00111CommunicationGizmo: http://rubyquiz.com/00111.rb Finally, Q left a note with the expected (approximate) output one should get when running 00111.rb: Dear myself, please make sure that ruby 00111.rb writes something like this to stdout before you hand your beautiful creation to the brutish 00111: Probability : 0.972972972972973 Approx. with: 0.957603280698574 Exponent: 4 (p**4 = 1/2; m = 2 ** e = 16) Decoded correctly? yes Original size: 1000 Compressed to: 23 bytes (2.3%) Compare to 57 bytes using deflate... ========================================= Testing buffered encoder/decoder Decoded correctly? (b) yes Decoded correctly? (c) yes Sincerely, Q Unfortunately, it seems Q is not doing any better, but management is sure you will be able to complete Q's code (that is, write 00111_gizmo.rb such that the above unit tests pass and the sample program works) in time, given all the material he left. 00111's mission will require _at least_ the following tests to pass: * test_rle * test_unrle * test_basic_encoding * test_basic_decoding * test_round_trip_probabilistic The other tests are optional (you can choose to comment them, but the sample program won't execute correctly if they fail, though), but you should probably do your best to make them pass: this is an excellent opportunity to prove your skills, and it seems Q's retirement is not too far away... Good luck. PS: you owe me one, man. Such opportunities are rare. ---- [1] http://urchin.earth.li/~twic/Golombs_Original_Paper/ [2] http://en.wikipedia.org/wiki/Golomb_coding [3] http://railsblob.blogspot.com/ |
Q:
Got your memo about needing help with the encoding. The code is below; I've taken a fairly straightforward approach here, since the handheld device you described has more than enough computational power to encode what it needs to quickly enough to get the message through. By the way, one of your unit tests - with the two StringIO objects - gave me an inordinate amount of trouble. I suggest that in the future you take advantage of IO.pipe when faced with a similar situation. Let me know what's up with those retirement plans, and look me up if you ever find yourself on this side of the pond. Your Friend, Marshall Flinkman #! /usr/bin/env ruby require 'io/nonblock' # The SecretAgent00111CommunicationGizmo class implements a way to # send a series of true/false values through a binary channel using # only a relatively small number of bytes. It does this by assuming # that true values are much more likely than false values, and # that therefore one can do a simple run length encoding on the # original series of true/false values and end up with a sequence # of numbers amenable to being compressed by a Rice code. (i.e. by # a Golomb code with the parameter a power of 2) # # The format of a compressed message of integers is: # # * One byte of _exponent_. Our code will use a parameter value of # b = 2 ** _exponent_ # * For each number _n_ that's encoded: # * _d_ '1' bits # * a '0' bit # * _exponent_ bits that as a binary number form the value _r_ # The number _n_ then has the value <tt> n == d * (2 ** exponent) + r </tt> # * Whatever number of '1' bits are necessary to pad the whole # transmission to an even byte boundary # # Bits within a byte are ordered as by _pack("B*")_ -- that is, MSB first. class SecretAgent00111CommunicationGizmo class UndefinedRLE < Exception; end end # We're going to define a bunch of class methods now, # and SecretAgent00111CommunicationGizmo is just too # long to write out over and over again, and I don't # like the look of putting "self." in front of each # method name as I define it, so... class << SecretAgent00111CommunicationGizmo # Run length encode an input array of true/false values. # Note that this array *must* end in a "false" value, or an # exception (SecretAgent00111CommunicationGizmo::UndefinedRLE) # will be thrown. def rle(tfarr) raise self::UndefinedRLE unless false == tfarr.last ans = [] tfarr.inject(0) {|a,tf| if tf a + 1 else ans<<a; 0 end } ans end # Undo the operation of +rle+. Expands a given array of numbers # into an array of true/false values. def unrle(inarr) inarr.map{|x| [[true]*x,false]}.flatten end # This method does the internal work of encoding a # given number by our Rice code into a string of '0' and '1'. # Not really for client use. def bits_for_encoding(x, exponent) a, b = x.divmod(1<<exponent) ['1' * a, sprintf('0%0*b', exponent, b)].join end # Encode a true/false array by first adding a "false" to the end # and then encoding the resulting string of numbers via our Rice # code. def encode(array,exponent) msg = rle(array+[false]).map{|x| bits_for_encoding(x,exponent) }.join msg += '1' * ((- msg.length) % 8) exponent.chr + [msg].pack("B*") end # This method does the internal work to decode a # message in '1' and '0' bits into an array of # integers. Note that the string in msg is changed # by this method, and shortened to whatever sequence # at the end couldn't be decoded. def decode_bits(msg,exponent) ans = [] trim_from = 0 msg.scan(/(1*)0([01]{#{exponent}})/) {|a,b| ans << a.length*(1<<exponent) + b.to_i(2) trim_from = $~.end(0) } msg.slice!(0,trim_from) ans end # Undo the result of +encode+ def decode(bitstring) # It's not polite to mangle someone else's bitstring # so don't use slice! here # Seriously, the sample program stops working if you # mangle bitstring because then it doesn't give the right # input to Decoder.new(StringIO.new(s)) exponent = bitstring.slice(0,1)[0] msg = bitstring[1..-1].unpack("B*")[0] unrle(decode_bits(msg,exponent))[0..-2] end end class SecretAgent00111CommunicationGizmo # Lets subclasses call class methods as their own def method_missing(meth, *args) self.class.send(meth,*args) end # This class is meant for on-the-fly compression to a given # IO channel. class Encoder < SecretAgent00111CommunicationGizmo def initialize(exponent, output) output << exponent.chr @exponent = exponent @output = output @current_byte = '' @current_run = 0 end def <<(tf) if tf @current_run += 1 else send_number(@current_run) @current_run=0 end self end def finish self << false @current_byte += '1' * ((-@current_byte.length)%8) output_maybe end private def output_maybe if @current_byte.length >= 8 out_bits = @current_byte.slice!(0,(@current_byte.length/8)*8) @output << [out_bits].pack("B*") end end def send_number(x) @current_byte += bits_for_encoding(x,@exponent) output_maybe end end # This class is meant for on-the-fly decompression to a given # IO channel. class Decoder < SecretAgent00111CommunicationGizmo attr_reader :exponent def initialize(io) # StringIO isn't really an IO object, so doesn't know about # nonblock. io.nonblock = true if io.respond_to?(:nonblock=) @io = io @initialized = false @remaining = '' end def exponent if not @initialized @exponent = @io.readchar @initialized = true end @exponent end def read exp = exponent ans = [] # Use read with no arguments since read with an integer # argument messes up the bizarre way StringIO is used in # the unit test. Does Q not know about IO.pipe ? buff = @io.read rescue buff = nil while buff and buff.length > 0 @remaining += buff.unpack("B*")[0] ans.concat unrle(decode_bits(@remaining, exp)) buff = @io.read rescue buff = nil end ans end end end __END__ -- s=%q( Daniel Martin -- [hidden email] puts "s=%q(#{s})",s.map{|i|i}[1] ) puts "s=%q(#{s})",s.map{|i|i}[1] |
In reply to this post by James Gray-7
It's my fault this quiz was released late and I think it is a fun
problem. I don't want to cheat anyone out of a chance to solve it, so I'm extending the quiz one week. I will summarize a week from tomorrow. James Edward Gray II |
In reply to this post by James Gray-7
> Q keeps a very detailed log of his activities, so you know he has been reading
> up on Information Theory, and has found a paper that addresses the very problem > he needs to solve: > > Run-length encodings--S. W. Golomb (1966); IEEE Trans Info Theory 12(3):399[1] Gave me the chance to actually read Golomb's paper, which I never had :) [snip] > Unfortunately, it seems Q is not doing any better, but management is sure you > will be able to complete Q's code (that is, write 00111_gizmo.rb such that the > above unit tests pass and the sample program works) in time, given all the > material he left. > > 00111's mission will require _at least_ the following tests to pass: > > * test_rle > * test_unrle > * test_basic_encoding > * test_basic_decoding > * test_round_trip_probabilistic > > The other tests are optional (you can choose to comment them, but the sample > program won't execute correctly if they fail, though), but you should probably > do your best to make them pass: this is an excellent opportunity to prove your > skills, and it seems Q's retirement is not too far away... I'm all for TDD, but a little bit of explanation would have been welcome 1) Golomb's paper mentions m, you use an exponent, such that 2**exponent is m (so we're missing all the fun with M; aren't these the Rice codes?) 2) placing the exponent in a BYTE before the bits is... practical... 3) why does there seem to be an extra 'false' trailing, in (some!) of the bitstreams of the testcases? After I figured out all of that (I happily started with arbitrary m... and with hindsight, (3) kept me confused longer than it should have; tests working on infinite bitstreams before turning to the finite bit-strings (and then byte-strings) would have helped (unfortunately test_decoder_partial is derived from those finite TEST_VECTORS and therefore carries its problems back into the infinite bitstream domain :( ) ) I managed: Probability : 0.972972972972973 Approx. with: 0.957603280698574 Exponent: 4 (p**4 = 1/2; m = 2 ** e = 16) Decoded correctly? yes Original size: 1000 Compressed to: 27 bytes (2.7%) Compare to 72 bytes using deflate... ================================================================================ Testing buffered encoder/decoder Decoded correctly? (b) yes Decoded correctly? (c) yes Now I'll have to figure out something very simple: when is the new deadline of this quiz :P Regards, Kero. |
On Sep 17, 2006, at 7:31 PM, Kero wrote:
> Now I'll have to figure out something very simple: when is the new > deadline > of this quiz :P <laughs> I will release the summary Thursday morning, which means you need to have it in before I start writing sometime Wednesday if you want me to look it over. James Edward Gray II |
>> Now I'll have to figure out something very simple: when is the new
>> deadline >> of this quiz :P > ><laughs> I will release the summary Thursday morning, which means > you need to have it in before I start writing sometime Wednesday if > you want me to look it over. OK, here we go :) Somehow, I dug into the tests and missed test_rle and test_unrle above the TEST_VECTORS. A pity, because they *do* get your mind set onto those trailing false's. Plus, especially test_unrle is trivial to implement. Run like `ruby test_00111_gizmo.rb -n test_unrle` :) Then on with test_basic_en/decoding. Encoder#<< shows directly why the extra false is required for a finite bitstream: the code doesn't do a bloody thing when appending 'true's. In order to get finite bitstrings (36 out of 37 which end with 'true') across the wire, you have to append a 'false'. For consistency *always* append a 'false' when you close the string. and that is precisely what Encoder#finish does. This way, Encoder#<< works for infinite bitstreams, too. And then you need some padding (with ones! so the decoder won't be able to decode them!) for those "practical" bytes. The decoder has a statemachine to see whether its reading bits for the multiplier (:div) or the remainder (:rem, which is a binary encoded number as we all know for these powers-of-two codes; this is not the case for the general Golomb codes, you'd need an extra offset and sometimes one bit less). I had to add a state for the exponent (:exp) in the beginning and move some code around, instead of just reading the exponent in initialize, because at that time the (string)io can still be empty. Note how Decoder#read returns as much as can be processed from the bitstream, up until what is received (dealing with byte boundaries; additional (fewer than eight) bits could be in the bitstream, but not yet available for the decoder; see how "practical" bytes are?) The Gizmo::encode and Gizmo::decode look amazingly like the test_encoder and test_decoder tests. Sorry :) By this time I was still messing with the trailing false, which I had to append to some tests. Apart from that, test_decode_partial and test_round_trip_probabilistic came almost for free. When I finally saw the light, I could remove the trailing 'false' in Gizmo::decode, put the tests back to what they were supposed to be, and immediately after that, Q's 00111.rb ran as it was supposed to. Thanks for the Quiz, taught me Golomb codes and Ruby's StringIO. Oh, and I realize now that test_decoder tests the infinite streams. And then, the code! ----- # Author: Kero van Gelder # Copyright: Kero van Gelder, can be distributed under LGPL require 'stringio' module SecretAgent00111CommunicationGizmo class UndefinedRLE < RuntimeError end def self.unrle(ary) ary.inject([]) {|un, val| un.concat([true] * val) un << false } end def self.rle(ary) result = [] while not ary.empty? nr = 0 while not ary.empty? and ary[0] nr += 1 ary.shift end raise UndefinedRLE if ary.empty? ary.shift result << nr end raise UndefinedRLE if result.empty? result end Log2 = Math.log(2) def self.encode(ary, exponent) io = StringIO.new encoder = Encoder.new(exponent, io) ary.each{|result| encoder << result} encoder.finish io.string end def self.decode(bitstring) ary = Decoder.new(StringIO.new(bitstring)).read ary.pop ary end class Encoder attr_reader :exponent, :io def initialize(exponent, io) @exponent, @io = exponent, io io.print [exponent].pack("c") @ones = 0 @buf = "" # @finished = false end def << bool finished? and raise "Channel closed" if bool @ones += 1 else m = 2**exponent div, rem = @ones.divmod m @buf << ("1" * div) @buf << (("%0#{exponent+1}s" % (rem).to_s(2)).gsub(/ /, "0")) if (blen = @buf.length) >= 8 #p [exponent, div, rem, @buf, blen] bytes = blen / 8 io.print([@buf.slice!(0, bytes * 8)].pack("B*")) end #p io.string @ones = 0 end end def finish self << false @finished = true @buf << ("1" * (8 - @buf.length % 8)) if @buf.length % 8 > 0 io.print([@buf].pack("B*")) end def finished? @finished end end class Decoder attr_reader :io def initialize(io) @io = io @state = :exp @buf = "" @div = 0 @rem = "" end def exponent if @state == :exp # and io.ready? @exponent = io.read(1).unpack("c")[0] @state = :div end @exponent end def read if @state == :exp # and io.ready? @exponent = io.read(1).unpack("c")[0] @state = :div end result = [] @buf << io.read.unpack("B*")[0] while not @buf.empty? #p [@buf, @state, @exponent, @div, @rem] if @state == :div while not @buf.empty? and @buf[0] == ?1 @buf.slice!(0, 1) @div += 1 end @state = :rem unless @buf.empty? end #p [@buf, @state, @div, @rem] if @state == :rem req = exponent + 1 - @rem.length if @buf.length < req @rem << @buf.slice!(0..-1) else @rem << @buf.slice!(0, req) result.concat([true] * (@div * 2**exponent + @rem.to_i(2))) result << false @div = 0 @rem = "" @state = :div end end end result end end end |
In reply to this post by James Gray-7
James Edward Gray II a écrit :
> It's my fault this quiz was released late and I think it is a fun > problem. I don't want to cheat anyone out of a chance to solve it, so > I'm extending the quiz one week. I will summarize a week from tomorrow. > > James Edward Gray II > So here is my solution ... maybe not very optimal, but working ... I just found not so easy to understand what he wanted exactly as an interface ... and found disturbing to give a paper to implement something different (yet similar and obviously inspired by the paper). Pierre 00111_gizmo.rb (4K) Download Attachment |
In reply to this post by James Gray-7
Hi,
here's my solution. It's based on strings of "1"s and "0"s, so the encoding can be done mainly with a format string ("%0#{@exponent+1}B") and the decoding with regular expressions. Regards, Boris ### 00111_gizmo.rb require 'stringio' module SecretAgent00111CommunicationGizmo class ExponentUndefined < Exception end class UndefinedRLE < Exception end def self.decode data events = Decoder.new(StringIO.new(data)).read events.pop # remove last false events end def self.encode events, exponent io = StringIO.new e = Encoder.new(exponent, io) events.each do |result| e << result end e.finish io.string end def self.rle events raise UndefinedRLE.new if events.size == 0 rl = [] truevals = 0 events.each do |result| if result truevals += 1 else rl << truevals truevals = 0 end end raise UndefinedRLE.new if truevals != 0 rl end def self.unrle rl events = [] rl.each {|n| events += [true]*n + [false]} events end class Encoder def initialize exponent, io @exponent = exponent @io = io @events = [] @bits = '' @io << exponent.chr end def << result @events << result if result == false encode_events end end def encode_events n = @events.size - 1 @events = [] @bits << "1" * (n / 2**@exponent) @bits << "%0#{@exponent+1}B" % (n % 2**@exponent) flush_bytes end def flush_bytes # write every 8 bits to the stream: @bits.gsub!(/[01]{8}/) do |byte| @io << byte.to_i(2).chr '' end end def finish @events << false encode_events # fill up remaining byte with "1"s: rest = @bits.size % 8 if rest != 0 @bits << "1" * (8 - rest) end flush_bytes end end class Decoder def bits(string) string.unpack("B*")[0] end def initialize io @io = io @exponent = nil @bits = '' @t = @n = nil end def exponent return @exponent if @exponent e = @io.getc if e @exponent = e else raise ExponentUndefined.new end @exponent end def add_events n = @n n += @t * 2**exponent if @t @t = @n = nil @events += SecretAgent00111CommunicationGizmo.unrle([n]) end def decode_bits loop do found = false if @bits =~ /^(1+)0/ @t = $1.size @bits.sub!(/^(1+)/, '') found = true end re = Regexp.new("^0([01]{#{exponent}})") if @bits =~ re @n = $1.to_i(2) @bits.sub!(re, '') add_events found = true end return unless found end end def read @events = [] begin exponent data = @io.read @bits += bits(data) decode_bits rescue ExponentUndefined # exponent not available yet in stream, try again later end @events end end end |
In reply to this post by Daniel Martin-7
Hi,
Here's my "solution" to the quiz. It passes the mandatory tests but not the StringIO tests (encode and decode_partial). I found it interesting to know about the Golomb coding, but I still can get ideas right about using IO with bits and bytes. This certainly points me to another area of study,... The units tests were very helpful because they covered a lot of different cases (showing a new bug almost each time!) but the probabilistic test case was even more interesting because it detected a bug outside of the test suite. I have heard about QuickCheck which is an automatic testing tool for Haskell (http://www.math.chalmers.se/~rjmh/QuickCheck/). This tool also makes use of random generated test cases and I am wondering if theses ideas could be imported into the Ruby world. Eric. ==================================== class SecretAgent00111CommunicationGizmo class UndefinedRLE < Exception end def self.rle(tries) # raise an error if there are no tries or if the array doesn't end with a false try if (tries.empty? || tries[-1]) raise UndefinedRLE.new("tries empty: #{tries.empty?} - tries ends with: #{tries[-1]}") end # array[0..-2] returns an array with the last element removed tries.map{|x| x ? "1" : "0"}.join("").split("0", -1)[0..-2].inject([]) do |result, tries_row| result << tries_row.size end end def self.unrle(tries) tries.inject([]) do |result, tries_row| tries_row.times {result << true} result << false end end def self.encode(array, exponent) tries, result = rle(array << false), "" tries.inject(result) do |result, tries_number| # divide the tries number with a power of two a, b = tries_number.divmod(2 ** exponent) # the quotient is unary-encoded ('1' repeated a times) # the remainder is binary encoded result << '1' * a + sprintf("0%0*b", exponent, b) end # pad the result with '1's in order to have an appropriate number of bytes result << '1' * (-result.length % 8) # add the exponent used for calculation and pack [exponent].pack("c") + [result].pack("B*") end def self.decode(bitstring) exponent, result = decode_exponent_and_values(bitstring) result end def self.decode_exponent_and_values(bitstring) bitstring = bitstring.unpack("B*")[0] # the exponent is binary encoded on 8 bits exponent = bitstring.slice!(0..7).to_i(2) result = [] # stop analyzing the bit if empty or if anything that's left is padding while(not bitstring.empty? and not bitstring =~ /\A1+$/) # the quotient is unary encoded at the beginning of the bitstring if != 0 coded_quotient = "" bitstring.scan(/(\A1+)0+\w*/){|x| coded_quotient = x[0]} bitstring.slice!(0, coded_quotient.size) quotient = coded_quotient.size # the remainder is binary encoded on exponent + 1 bits remainder = bitstring.slice!(0..exponent).to_i(2) result << quotient * (2 ** exponent) + remainder end return exponent, unrle(result) end class Encoder def initialize(exponent, io) @values = [] @io = io @exponent = exponent end def << value @values << value end def finish @io << SecretAgent00111CommunicationGizmo.encode(@values, @exponent) end end class Decoder attr :exponent def initialize(io) @exponent, @array = SecretAgent00111CommunicationGizmo.decode_exponent_and_values(io.string) end def read @array + [false] end end end |
Powered by Nabble | Edit this page |