Archive for the ‘rails’ Category

Rails + Tidy + REXML

Thursday, March 1st, 2007

It wasn’t totally straightforward to get Tidy, REXML and Rails to play together, so I thought I would write down what I did and how, to save time for others.

The reason for doing this is that I get text in (X)HTML format through RSS feeds and I want to turn each long text into a short excerpt.

After a bit of thinking and googling I figured out that slicing an HTML document after a given number of characters is not trivial. Because of the tags, you need to actually parse the document and keep track of which tags you need to close when you reach the character limit. Luckily for us, Mike Burns has already written a function for Truncating HTML in Ruby. Perfect!
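
To see why a plain character cut is not enough, here is a minimal sketch. The helper names `naive_cut` and `unclosed_tags` are made up for this illustration, and the tag counter only understands attribute-free tags.

```ruby
# Naive truncation: cut by character count, ignoring markup entirely.
def naive_cut(html, length)
  html[0, length]
end

# Number of simple opening tags minus closing tags in +fragment+.
# Only handles attribute-free tags, which is enough for this illustration.
def unclosed_tags(fragment)
  opening = fragment.scan(/<[a-z]+>/i).length
  closing = fragment.scan(%r{</[a-z]+>}i).length
  opening - closing
end

cut = naive_cut('<p>Hello <b>world</b></p>', 14)
puts cut                 # the <p> and <b> tags are left open
puts unclosed_tags(cut)  # how many tags still need closing
```

The cut lands in the middle of the bold text, so both tags stay open. That is exactly the bookkeeping a real truncation function has to do.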

However, after adding that piece of code (and unit tests for it, of course) you will find that REXML barfs if the input is not well-formed. And since you have no control over the content of the RSS feeds, there is no way to guarantee that it is.
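
A small sketch of the failure, assuming the `rexml` library is available. The helper name `well_formed?` is hypothetical. It simply reports whether REXML accepts the markup.

```ruby
require 'rexml/document'

# Returns true when REXML can parse +markup+, false when it raises.
def well_formed?(markup)
  REXML::Document.new(markup)
  true
rescue REXML::ParseException
  false
end

puts well_formed?('<p>Hello <b>world</b></p>')  # properly nested
puts well_formed?('<p>Hello <b>world</p></b>')  # tags closed in the wrong order
```

Feed markup like the second line straight into the truncation code and you get an exception instead of an excerpt.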

Luckily, Tidy comes to the rescue. Tidy is a library that corrects invalid HTML. Install the Tidy library and then the tidy Ruby gem:

gem install tidy

Unfortunately, you have to set the path to the library manually before you can use it:

Tidy.path = '/usr/lib/libtidy.so'

If you, like me, use an Apple laptop for development and Linux on the server, that path will differ between environments. So I introduced a constant in the Rails environment files. In config/environments/production.rb I put:

TIDY_LIB_PATH = '/usr/lib/libtidy.so'

Naturally, I set it to the correct path for my PowerBook in config/environments/development.rb. Then I just do

Tidy.path = TIDY_LIB_PATH

before using Tidy and everything is good.
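
As an alternative to per-environment constants, you could probe a list of candidate paths and use the first one that exists. This is only a sketch: the helper name `locate_library` is hypothetical, and the `.dylib` path is an assumed location rather than something I have verified. Only `/usr/lib/libtidy.so` comes from my own setup.

```ruby
# Return the first path in +candidates+ that exists on disk, or nil.
def locate_library(candidates)
  candidates.find { |path| File.exist?(path) }
end

# Hypothetical locations; adjust these for your own machines.
TIDY_CANDIDATES = [
  '/usr/lib/libtidy.so',    # Linux server path used in this post
  '/usr/lib/libtidy.dylib'  # assumed Mac location, not verified
].freeze

path = locate_library(TIDY_CANDIDATES)
puts(path ? "Would set Tidy.path = #{path}" : 'No Tidy library found')
```

The constants approach is more explicit. Probing saves you from editing two files, but it hides which library actually gets loaded.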

To make Tidy behave decently you need to set the following options:

  • tidy.options.show_body_only = true – don’t output body and html tags
  • tidy.options.output_xhtml = true – output xhtml
  • tidy.options.wrap = 0 – don’t write newlines all over the place
  • tidy.options.char_encoding = 'utf8' – use utf8 to play nice with rails

In the end, this is what I ended up with:

require 'rexml/parsers/pullparser'
require 'tidy'

def make_excerpt
  slice(tidy_up_html(content), 2000)
end

# Run the raw feed HTML through Tidy so REXML gets well-formed XHTML.
def tidy_up_html(html)
  Tidy.path = TIDY_LIB_PATH

  Tidy.open do |tidy|
    tidy.options.show_body_only = true
    tidy.options.output_xhtml = true
    tidy.options.wrap = 0
    tidy.options.char_encoding = 'utf8'
    tidy.clean(html)
  end
end

# attrs_to_s is called by slice but is not shown in this post;
# this version is an assumption about what it does.
def attrs_to_s(attrs)
  attrs.map { |name, value| %(#{name}="#{value}") }.join(' ')
end

def slice(string, length, ellipsis = '...')
  p = REXML::Parsers::PullParser.new(string)
  tags = []
  new_len = length
  results = ''
  while p.has_next? && new_len > 0
    p_e = p.pull
    case p_e.event_type
    when :start_element
      tags.push p_e[0]
      results << "<#{tags.last} #{attrs_to_s(p_e[1])}>"
    when :end_element
      results << "</#{tags.pop}>"
    when :text
      # String#first comes from ActiveSupport, so this needs Rails loaded.
      results << p_e[0].first(new_len)
      current_len = new_len
      new_len -= p_e[0].length
      if new_len < 0
        # find next dot
        i = p_e[0].index('.', current_len)
        results << p_e[0].slice(current_len, i - current_len) if i
        results << p_e[0].slice(current_len, p_e[0].length) unless i
        results << ellipsis
      end
    else
      results << "<!-- #{p_e.inspect} -->"
    end
  end
  # Close any tags that are still open at the cut-off point.
  tags.reverse.each do |tag|
    results << "</#{tag}>"
  end
  results
end

I modified Mike Burns’ method so that after the given number of characters has been reached it will still include text until the next ‘.’ character. I figured it’s much nicer with an excerpt that ends with a complete sentence.
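
The sentence-extension step on its own, stripped of the HTML handling, looks roughly like this. It is a simplified sketch for plain text only, and the name `cut_at_sentence` is made up for the example.

```ruby
# Cut +text+ at +limit+ characters, then keep going up to (but not
# including) the next '.' so the excerpt ends at a sentence boundary.
# Text that already fits within +limit+ is returned unchanged.
def cut_at_sentence(text, limit, ellipsis = '...')
  return text if text.length <= limit

  head = text[0, limit]
  dot = text.index('.', limit)
  # No dot after the limit: take the rest of the text.
  tail = dot ? text[limit, dot - limit] : text[limit..-1]
  head + tail + ellipsis
end

puts cut_at_sentence('One two three. Four five.', 6)
# prints "One two three..."
```

Note that the dot itself is dropped and the ellipsis takes its place, the same as in the full method.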

Feel free to use this code if you want.

A Swedish article on Rails

Wednesday, January 31st, 2007

Peter Marklund has written an article about Rails in Informator’s magazine Format:

http://www.informator.se/rails_ger_dig_arbetsgladjen_.aspx

What’s new in Prototype 1.5?

Thursday, January 25th, 2007

Scott Raymond has written a good overview of What’s New in Prototype 1.5 on XML.com. Well worth reading!

acts_as_versioned

Thursday, November 2nd, 2006

Developing with Ruby on Rails is wonderful. I had been dreading implementing versioned objects, which are needed if I want to introduce publicly editable fields. But after a bit of googling I found the acts_as_versioned plugin that does it all for you in one line of code!

Amazing!

class Page < ActiveRecord::Base
  acts_as_versioned
end

The acts_as_versioned at the top of the model class is all that’s needed. It’s so simple it makes you wonder what the catch is, except there is no catch.

Could not find rails (> 0) in the repository

Tuesday, October 24th, 2006

If you get this error when trying to install Rails with RubyGems, you apparently need to remove your source cache.

Not totally obvious.

Update: As you can see from the comments, re-running the command should solve it for most people.

Test your rails helpers

Thursday, October 5th, 2006

This evening I tried to get my test coverage up a bit by writing tests for the helper classes using Geoffrey Grosenbach’s Test Your Helpers instructions. But I couldn’t get it to work. Has anyone used it? Does it work? Or is there some other way?

Testing the helpers is the last big thing left to get to the unreachable 100% test coverage. So naturally I want to solve it!

Ruby on Rails on Ubuntu

Wednesday, October 4th, 2006

When using apache2 it’s recommended to use fcgid instead of fastcgi. That requires some changes that are not totally obvious; see the end of Claudio Cicali’s How to install Ruby on Rails on Ubuntu 5.10 post if you, like me, can’t figure it out yourself.

RailsConf Europe

Saturday, September 16th, 2006

I wasn’t there myself, but based on David’s Decompressing RailsConf Europe post I wish I had been. Maybe next year…

If you can’t get RJS to work

Saturday, September 2nd, 2006

It might be because you, like me, have put the following in application.rb:

    before_filter :set_content_type
    def set_content_type
      @headers["Content-Type"] = "text/html; charset=utf-8"
    end

Change that to:

    before_filter :set_content_type
    def set_content_type
      if request.xhr?
        @headers["Content-Type"] = "text/javascript; charset=utf-8"
      else
        @headers["Content-Type"] = "text/html; charset=utf-8"
      end
    end

And your RJS magic will start to work!
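
The decision the filter makes can be sketched outside Rails as plain Ruby. `request.xhr?` checks the X-Requested-With header, which Prototype sets to XMLHttpRequest on Ajax calls. The helper names `xhr?` and `content_type_for` and the plain hash of headers are stand-ins for illustration, not Rails API.

```ruby
# Mimics what request.xhr? checks: an X-Requested-With header
# containing "XMLHttpRequest" (case-insensitive).
def xhr?(headers)
  !(headers['X-Requested-With'].to_s =~ /XMLHttpRequest/i).nil?
end

# Pick the content type the same way the filter above does.
def content_type_for(headers)
  type = xhr?(headers) ? 'text/javascript' : 'text/html'
  "#{type}; charset=utf-8"
end

puts content_type_for('X-Requested-With' => 'XMLHttpRequest')
puts content_type_for({})
```

The browser only evaluates the RJS response as JavaScript when it arrives with the JavaScript content type, which is why forcing text/html on every response breaks it.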

Dependent select boxes in rails

Friday, September 1st, 2006

Adam C. Hegedus explains how to do Ajaxed Select Boxes in Rails. There are a few typos in the code snippets but I’m sure you’ll notice them.