The Accidental Rubyist

invalid byte sequence in UTF-8

Archive for March 2007

Why my ruby sucks

I’ve been wondering why my Ruby is so poor and why I am so slow at it. I read a blog the other day and now I know … the reason is WHY.
Anyway, I am finally reading the Poignant Guide now. Hopefully you’ll see better code in some distant future.


Written by totalrecall

March 22, 2007 at 9:24 pm

Posted in ruby

Quote of the … Year ?

ie_hack: Internet Explorer is the worst piece of crap to have ever been written, with the possible exception of Windows itself. Since IE is unable to parse proper XML, we have to provide a hack to generate XML that IE’s limited abilities can handle. This hack inserts a space before the /> on empty tags. Defaults to false.
— Documentation of REXML::Document
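
To see what the hack does in practice, here is a minimal sketch. It assumes the positional form write(output, indent, transitive, ie_hack) of REXML::Document#write, and the sample document is made up for illustration.

```ruby
require "rexml/document"

# Made-up sample document containing one empty tag.
doc = REXML::Document.new("<p>line one<br/>line two</p>")

plain = String.new
doc.write(plain, -1, false, false)   # ie_hack off (the default)

hacked = String.new
doc.write(hacked, -1, false, true)   # ie_hack on

puts plain
puts hacked
```

With the hack on, the empty tag should be written with a space before the closing "/>".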

Written by totalrecall

March 22, 2007 at 5:51 pm

Posted in ruby

Getting Categories using Unix Blogger

Aaah! For labels to be displayed on Google’s Blogger, the category’s scheme needs to be specified too.
<category scheme=”; term=”blogger”/>
Now labels will come through while blogging with

Feeling amorous, she looked under the sheets and cried, “Oh, no,
it’s Microsoft!”

Written by totalrecall

March 19, 2007 at 3:35 pm

Posted in unix

Inconsistencies in Unix programs

I use the command line all the time, along with Ruby and Vim. I also use regular expressions all the time.
With all the brains behind the various unices/unixes, one would expect consistent handling of regexes across the Unix programs on a single system.
Vim does accept “\d” in place of [0-9], but it requires escaping the “+” and not the “*”.
So I can say “\d*” but I must say “\d\+”.
Most other programs (I am using OS X; perhaps the GNU versions are better) do not recognize “\d” and other such short forms.
The other day I found that expr does not understand the “+” at all, even with escaping!
None of the standard Unix programs such as grep, sed and expr understand minimal (non-greedy) matching, which from my perspective should have been the default.
The escaping of round brackets also differs: Vim and the Unix programs do it one way, Perl and Ruby the other.
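
To make “minimal matching” concrete, here is a small Ruby sketch comparing the greedy “.*” with the minimal “.*?”. The sample string is made up for illustration.

```ruby
# Made-up sample line with two tagged spans.
html = "<b>one</b> and <b>two</b>"

# Greedy: ".*" grabs as much as it can, so it runs to the LAST </b>.
greedy = html[/<b>(.*)<\/b>/, 1]

# Minimal: ".*?" grabs as little as it can, so it stops at the FIRST </b>.
minimal = html[/<b>(.*?)<\/b>/, 1]

puts greedy    # one</b> and <b>two
puts minimal   # one
```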

For those needing a quick way of doing minimal regexp matching, here’s something in Ruby:
ruby -ne 'if /<title.*?>(.*?)<\/title>/ then puts $1;end'
The first “.*?” after title is there because the string contains single quotes, and I cannot put a single quote within the command being sent to ruby. If I use double quotes around the command, the “$1” is interpreted by the shell.
So then I tried putting this in a program to which I could pass a regexp and filenames. Since the regexp passed in has to be substituted (if /$regexp/), the command has to be in double quotes, but then the “$1” also gets substituted by the shell! A little delving into the Pickaxe got me an answer …

regexp=$1; shift   # first argument is the regexp, the rest are filenames
ruby -ne "if /$regexp/ then puts Regexp.last_match(1);end" "$@"

Save the above as, and call as follows:
./ '<title.*?>(.*?)<\/title>' *.html
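
The shell quoting can also be avoided altogether by writing the whole thing in Ruby. The following is only a sketch of that alternative, not the script described above; the helper name first_captures and the file name grab.rb are made up for illustration.

```ruby
#!/usr/bin/env ruby
# Print the first capture group of every line that matches a pattern.

# Returns the first capture of each matching line; other lines are skipped.
def first_captures(pattern, lines)
  regexp = Regexp.new(pattern)
  lines.map { |line| regexp.match(line) }.compact.map { |m| m[1] }
end

# Command-line use: the first argument is the pattern, the rest are filenames.
if __FILE__ == $PROGRAM_NAME && !ARGV.empty?
  pattern = ARGV.shift
  puts first_captures(pattern, ARGF.each_line)
end
```

Saved as, say, grab.rb, it would be called as: ruby grab.rb '<title.*?>(.*?)<\/title>' *.html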
“Who is General Failure and why is he reading my hard disk ?”

Written by totalrecall

March 19, 2007 at 12:54 pm

Posted in unix

Simple Drupal Blog Poster (ruby xmlrpc)

After yesterday’s struggle getting ruby 1.8.5 to post to Drupal, today I quickly put together a simple Drupal poster. It creates a new post, and can update an existing one.
It’s a simple poster in the sense that it takes a file in the following minimal format:
first line is the subject
rest is the content.
When editing (updating a post), the first line must be POSTID: <postid>.
That’s it: no categories etc.
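
As an illustration of that file format, here is a minimal parsing sketch. It is not the code from drupalpost.rb; the method name parse_post is made up, and it assumes that when a POSTID line is present the subject follows on the second line.

```ruby
# Parse the minimal post format: optional "POSTID: <postid>" line,
# then the subject line, then the content.
# Assumption: in the editing case the subject is on the second line.
def parse_post(text)
  lines = text.lines.map(&:chomp)
  post_id = nil
  match = /\APOSTID:\s*(\S+)/.match(lines.first.to_s)
  if match
    post_id = match[1]
    lines.shift                      # drop the POSTID line
  end
  subject = lines.shift
  { post_id: post_id, subject: subject, content: lines.join("\n") }
end

new_post = parse_post("My subject\nFirst paragraph.\nSecond paragraph.\n")
edited   = parse_post("POSTID: 42\nMy subject\nUpdated text.\n")
p new_post
p edited
```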

For a full-fledged drupal/MT poster, please use If you are having problems with ruby and drupal, please see the previous post. A minor change or two is required in xmlrpc/client.rb. I am posting to an old version of Drupal; perhaps the newer versions correctly return the Content-Type as “text/xml” and not html.
DOWNLOAD drupalpost.rb.
“If you want to travel around the world and be invited to speak at a lot
of different places, just write a Unix operating system.”
(By Linus Torvalds)

Written by totalrecall

March 19, 2007 at 12:44 am

Posted in blogging, ruby

feedvalidator and drupal xmlrpc

I had hoped to validate feeds for being proper XHTML before pushing them to Google’s Blogger.
I tried feedvalidator on the XML feeds and entries returned by Google’s Blogger. Feedvalidator throws up errors on all of Blogger’s XML feeds! So that’s out.

I spent some hours trying to push an XML-RPC request to Drupal. I tried several of the MetaWeblog/MT commands, but my ruby program (1.8.5 as per Hivelogic) kept throwing the same error:
in `do_rpc': Wrong content-type (RuntimeError)
from /usr/local/lib/ruby/site_ruby/1.8/xmlrpc/client.rb:382:in `call2'
from /usr/local/lib/ruby/site_ruby/1.8/xmlrpc/client.rb:372:in `call'

Surprisingly, ecto and are connecting and posting fine from here. Could not figure out from’s code as to how it actually makes the call.

My code was as follows:
require "xmlrpc/client"
server = XMLRPC::Client.new("", "/xmlrpc.php")
# Call the remote server and get our result
result = server.call("mt.supportedMethods")
#result = server.call("mt.supportedTextFilters")
#result = server.call("metaWeblog.getRecentPosts", mymap)
#result = server.call("mt.supportedTextFilters", mymap)

OK! Found out more.
I opened client.rb and found that this exception is raised when the response’s Content-Type is anything other than “text/xml”.
I printed the data returned by the server and found that all the supported methods WERE coming through fine.
The document returned by Drupal seems to be XML with an XML header, but the content type was “text/html;charset=utf-8”.
So I suppressed the “raise” in the following block:

if resp["Content-Type"] != "text/xml"
  if resp["Content-Type"] == "text/html"
    raise "Wrong content-type: \n#{data}"
  else
    raise "Wrong content-type"
  end
end

With the raise suppressed, the process did not complete and had to be aborted.

I checked the output returned by LiveJournal on this page: and it is the exact same format that drupal is returning in.
AND SOOO … I hardcoded the content type of the response to XML by adding a line:
resp["Content-Type"] = "text/xml"
and VOILA! It works!!!
This means that even if the returned content type had been
“text/xml;charset=utf-8”, the problem would still have occurred, since client.rb checks for exactly “text/xml”.
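
A more forgiving check would compare only the media type and ignore parameters such as the charset. The following is a sketch of that idea, not the code in client.rb; the helper name media_type is made up.

```ruby
# Return just the media type of a Content-Type header value,
# dropping parameters such as ";charset=utf-8".
def media_type(header)
  header.to_s.split(";").first.to_s.strip.downcase
end

puts media_type("text/xml;charset=utf-8")    # text/xml
puts media_type("text/html;charset=utf-8")   # text/html
puts media_type("text/xml") == "text/xml"    # true
```

Note that this alone would not have helped with the Drupal response described here, since that was sent as text/html.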

Written by totalrecall

March 18, 2007 at 12:06 am

Posted in Uncategorized

A ONE-line signature rotator

If you have a file with quotations, verses or whatever that you would like to print out like the fortune program or rotator, you need do only the following:

1. Break the file into multiple files, one for each quote or verse.

perl -pe 'BEGIN {$n=1} printf " -- Author name\n" and open STDOUT, ">$ARGV.$n" and $n++ if /^   \d/' verses.txt

The above line creates files named verses.txt.1, verses.txt.2 and so on, one per quote. The split is based on the regex /^   \d/, which in my case means a line starting with three spaces and then a number. Modify this to suit your file. An attribution is also printed at the end of each file.
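
The same split can be sketched in Ruby. This is not a translation of the perl line: the method name split_quotes is made up, no attribution is appended, and any lines before the first numbered line end up in the first chunk.

```ruby
# Split text into chunks, starting a new chunk at every line that
# matches the start pattern (three spaces and a digit by default).
def split_quotes(text, start = /^   \d/)
  text.lines.slice_before { |line| line =~ start }.map(&:join)
end

# Made-up sample input in the format described above.
sample = "   1. First verse\nsecond line\n   2. Second verse\n"
chunks = split_quotes(sample)
chunks.each_with_index do |chunk, i|
  puts "--- would go to verses.txt.#{i + 1} ---"
  puts chunk
end
```

Replacing the puts calls with File.write("verses.txt.#{i + 1}", chunk) would write the files.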

2. This line prints a random quote:
cat verses.txt.`jot -r 1 1 298`

jot is a wonderful command that can print random or sequential numbers (and a lot more). This command prints 1 random number between 1 and 298. Thus a random file name is generated.

The same could be accomplished in your favorite programming language (aka ruby) as:
system("cat verses.txt.%d" % (1 + rand(298)))
(I know my ruby is poor! … but that’s just one line!)
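
A slightly longer sketch that does not need the hardcoded 298 or the call out to cat: it picks at random from whatever numbered files exist. The method name random_quote is made up, and it assumes the files are named verses.txt.1, verses.txt.2 and so on, as created in step 1.

```ruby
# Return the contents of a randomly chosen "<prefix>.<number>" file,
# or nil if no such file exists.
def random_quote(prefix = "verses.txt")
  files = Dir.glob("#{prefix}.*").select { |name| name =~ /\.\d+\z/ }
  return nil if files.empty?
  File.read(files.sample)
end

if __FILE__ == $PROGRAM_NAME
  puts random_quote || "no quote files found"
end
```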

You can dump that one line into a file, say “”, make it executable with “chmod +x“, put it in your path, and shoot verses off.

Cheap, simple, and free!

(The perl snippet is thanks to this page.)

15. There is no distinction between pleasure and pain, man and woman,
success and failure for the wise man who looks on everything as equal.
— Ashtavakra Gita

Written by totalrecall

March 16, 2007 at 5:25 am

Posted in unix