Life On The Edge With Merb, DataMapper & RSpec

This book is a work in progress; your help is requested on GitHub.

Foreword

(TODO) the foreword

Preface

This is a collaborative effort to document the features of Merb and DataMapper, while also providing example Merb applications.

Who is this for?

The target reader is someone who has some experience writing Ruby on Rails applications and is looking to try out a new framework. If you're looking for the AWDR equivalent for Merb, you may be disappointed, as this is a work in progress.

Copyright © 2008 Respective Authors.

Authors


This work is licensed under a Creative Commons Attribution-Noncommercial 2.0 UK: England & Wales License.

Creative Commons License

Source code of the applications is dual licensed under the MIT and GPL licenses:

What's Merb, DataMapper & RSpec?

If you're not living on the edge, you're taking up too much room. - Alice Bartlett

Merb, DataMapper and RSpec are all open source projects that are great for building kick-ass web applications. They are all in active development and although it can be hard, we'll try our best to keep up-to-date.

Merb

Merb is a relatively new web framework with an initial 0.0.1 release in October 2006. Ezra Zygmuntowicz is Merb's creator, and continues to actively develop Merb along with a dedicated development team at Engine Yard and many other community contributors.

Merb has obvious roots and inspiration in the Ruby on Rails web framework. If you know Ruby and have used Rails you're likely to get the hang of Merb quite easily.

While there are similarities, Merb is not Ruby on Rails. There are core differences in design and philosophy. In many areas where Rails chooses to be opinionated, Merb is agnostic: with respect to the ORM, the JavaScript library and the template language. The Merb philosophy also rejects the idea of a monolithic framework. Instead, Merb consists of a number of gems: merb-core, merb-more and merb-plugins. This means that you can pick and choose the functionality you need, instead of cluttering up the framework with non-essential features.

The merb gem installs both merb-core and merb-more, which is all you need to get started straight away. The benefit of this modularity is that the framework remains simple and focused, with additional functionality provided by gems.

Thanks to Merb's modularity, you are not locked into using any particular libraries. For example, Merb ships with plugins for several popular ORMs and provides support for both Test::Unit and RSpec.

merb-core alone provides a lightweight framework (à la Camping) that can be used to create a simple web app, such as an upload server or API provider, where the functionality of an all-inclusive framework is not necessary.

DataMapper

DataMapper is an Object-Relational Mapper (ORM) written in Ruby by Sam Smoot. We'll be using DataMapper with Merb. As previously mentioned, Merb does not require the use of DataMapper. You can just as easily use the same ORM as Rails (ActiveRecord) if you prefer.

We have chosen to use DataMapper because of its feature set and performance. One of the differences between it and ActiveRecord that I find useful is the way database attributes are handled. The schema, migrations and attributes are all defined in one place: your model. This means you no longer have to look around in your database or other files to see what is defined.

While DataMapper has similarities to ActiveRecord, we will be highlighting the differences as we go along.

RSpec

RSpec is a Behaviour Driven Development framework for Ruby. It consists of two main pieces, a Story framework for integration tests and a Spec framework for object tests. Both these components are implemented as Domain Specific Languages which help to make the stories and specs created more readable.

Merb currently supports the Test::Unit and RSpec testing frameworks. Both Merb and DataMapper use the RSpec testing framework themselves, so we will be covering some aspects of it so that you can use it for your own applications.

What About Ruby On Rails?

[Merb is] Harder, Better, Faster, Stronger, to quote Daft Punk - Max Williams

So what's the big deal? We have Ruby on Rails and that's enough, isn't it? There is little doubt that Ruby on Rails has rocked the web application development world. You have to give credit where credit's due, and Ruby on Rails is definitely a great web framework. However, there is no such thing as a one-size-fits-all solution. Ruby on Rails is opinionated software, which provides many benefits such as Convention over Configuration. On the other hand, this also means that Ruby on Rails can be unforgiving if you don't want to do things 'the Rails way'.

Where Rails is opinionated, Merb is agnostic. For example, you can easily use your favourite ORM (ActiveRecord, DataMapper, Sequel) or none at all.
Similarly, you can choose the JavaScript library and template language that you are most comfortable with, or that best meets the requirements of your specific project.

If performant were a word, Merb would be it. One of Merb's design mantras is "No code is faster than no code". Merb has super-fast routing and is thread-safe. The core functionality is kept separate from the other plugins and it uses less Ruby 'magic', making it easier to understand and hack.

Rails (and consequently Ruby) has received a lot of criticism for not being suitable for large scale web applications, which isn't necessarily true. Merb has been built from the outset to prove that Ruby is a viable language for building fast and scalable web applications.

At the end of the day it's about choice. There are many new Ruby frameworks springing up, undoubtedly encouraged by the success of Rails. In our opinion, Merb shows the most promise of these.

If you'd like to take a look at some other frameworks these links should get you started:

Communities

i'm going to become rich and famous after i invent a device that allows you to stab people in the face over the internet - <[SA]HatfulOfHollow>

The internet is a scary place, but fortunately the Ruby community is very friendly. All these open source projects rely on contributions from the community, so if something needs fixing, consider helping out.

Websites

These are the first places to go for help. Check out the API documentation and see if you can find your answer there.

IRC Channels - freenode.net

If you can't find what you were looking for in the API docs, you can join the respective IRC channel on FreeNode and ask your question there. You may need to wait for a response.

Mailing Lists

The mailing lists are another good way to get help. The response time isn't as fast as asking in an IRC channel, but it can be useful to search the archives to see if someone else has had your problem before.

Bug Trackers

Your problem may or may not be a known bug. Search the bug trackers and submit a ticket if it's not there already (don't forget to include a description and test cases, or better yet: a patch!). You may find the ticket is solved in the edge version.

Wikis

Getting Started

XKCD - Compiling

Before we get started I'm going to assume you have the following installed:

What will be covered

The Easy Way

If you're on a *nix operating system then keeping up to date with all the edge versions of these gems can be made really easy by using the Sake tasks.

The Merb Sake tasks can be found in the merb-more repository under the tools directory. The Sake tasks for DataMapper are in the dm-dev repository at http://github.com/dkubb/dm-dev/.

To install the Sake tasks, run sake -i PATH, where PATH is the path to the Sake tasks file on your local machine. For example:

sake -i ~/dev/opensource/merb/merb-more/tools/merb-dev.rake

To do a fresh clone of all the repositories, use sake dm:clone and sake merb:clone respectively. To keep up to date afterwards, you just need to execute:

sake dm:update

and

sake merb:update

to update the DataMapper and Merb gems.

What you probably really want, though, is to wipe out the Merb and DM gems, update the sources and then install the freshly built gems. Use sake merb:gems:refresh and sake dm:gems:refresh to do so.

If You're Hardcore

Installing Merb


If you have an older version of Merb (<0.9.2) you should remove all merb and datamapper related gems before continuing. Use gem list to see your installed gems. The following command will uninstall the gem you specify:

sudo gem uninstall the_gem_name

Installing the merb gems should be as simple as:

sudo gem install merb --source http://merbivore.org

or for JRuby:

jruby -S gem install merb mongrel

Unfortunately we are living right on the edge of development so we'll need to get down and dirty with building our own gems from source. Luckily this is much easier than it sounds...

Start by installing the gem dependencies:

sudo gem install rack mongrel json erubis mime-types rspec hpricot \
    mocha rubigen haml markaby mailfactory ruby2ruby

or for JRuby:

jruby -S gem install rack mongrel json_pure erubis mime-types rspec hpricot \
    mocha rubigen haml markaby mailfactory ruby2ruby

Then download the merb source:

git clone git://github.com/sam/extlib.git
git clone git://github.com/wycats/merb-core.git
git clone git://github.com/wycats/merb-plugins.git
git clone git://github.com/wycats/merb-more.git

Then install the gems via rake:

cd extlib ; rake install ; cd ..
cd merb-core ; rake install ; cd ..    
cd merb-more ; rake install ; cd ..
cd merb-plugins; rake install ; cd ..

Note that Merb and DataMapper have shared the Extlib library since the 0.9.3 release of DM. The json_pure gem is needed for Merb to install on JRuby (a Java implementation of the Ruby interpreter); otherwise use the json gem, as it's faster.

Merb is ORM agnostic, but as the title of this book suggests we'll be using DataMapper. Should you want to stick with ActiveRecord or play with Sequel, check the Merb documentation for install instructions.

Installing DataMapper


DataMapper has split into the gems dm-core and dm-more; the old datamapper gem is now outdated.

If you have an older version (< 0.9) of datamapper, data_objects, do_mysql or merb_datamapper, you should remove it first.


We will use MySQL in the following example, but you can use either sqlite3 or PostgreSQL instead; just install the appropriate gem. You will also need to ensure that MySQL is on your system path for the gem to install correctly.

To get the gems from source:

git clone git://github.com/sam/extlib.git
git clone git://github.com/sam/do.git

cd extlib
rake install ; cd ..
cd do
cd data_objects
rake install ; cd ..
cd do_mysql  # || do_postgres || do_sqlite3
rake install ; cd ../..

git clone git://github.com/sam/dm-core.git
git clone git://github.com/sam/dm-more.git

cd dm-core ; rake install ; cd ..
cd dm-more
rake install

To update a gem from source, run git pull and rake install again.
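
If you find yourself repeating that update often, the steps can be written down once. The following is only a sketch: it assumes the repositories were cloned side by side in one directory as shown above, and it just prints the commands rather than running them. The constant and method names are made up for this example.

```ruby
# Assumption: every repository below was cloned into the current directory,
# as in the install steps above.
CLONES       = %w[extlib do dm-core dm-more].freeze
INSTALL_DIRS = %w[extlib do/data_objects do/do_mysql dm-core dm-more].freeze

# Build the shell commands for one update pass. Each command runs in its own
# subshell, so the "cd" never leaks into the next one.
def update_commands(clones, install_dirs)
  pulls    = clones.map       { |dir| "(cd #{dir} && git pull)" }
  installs = install_dirs.map { |dir| "(cd #{dir} && rake install)" }
  pulls + installs
end

# Dry run: print the commands so they can be inspected or pasted into a shell.
update_commands(CLONES, INSTALL_DIRS).each { |command| puts command }
```

Swap do/do_mysql for do/do_postgres or do/do_sqlite3 if you installed a different adapter.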

Installing RSpec

The rspec gem was installed in the Merb section above. However, if you want to reinstall the gem or build it from source, run one of the following:

gem install -r rspec

# or

git clone git://github.com/dchelimsky/rspec.git
cd rspec
rake gem
sudo gem install pkg/rspec-*.gem

Creating an App

One of the best ways to become familiar with a framework is to jump in and get your hands dirty. So now that we've got everything installed, it's time to roll up your sleeves and create a test Merb application.

merb-more comes with a gem called merb-gen, which gives you a command line tool of the same name that is used for all of your generator needs. You can think of it as Merb's script/generate. Running merb-gen from the command line with no arguments will show you all of the generators that are available.

Merb follows the same naming convention for projects as Rails, so 'my_test_app' and 'Test2' are valid names but 'T 3' is not (they need to be valid SQL table names).
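
As a rough illustration of that rule, here is a small check in plain Ruby. It assumes that "valid" means letters, digits and underscores only, not starting with a digit; the generator itself may be stricter or more lenient, and plausible_app_name? is a hypothetical helper, not part of merb-gen.

```ruby
# Assumption: a usable name looks like an unquoted SQL identifier.
VALID_APP_NAME = /\A[A-Za-z_][A-Za-z0-9_]*\z/

# Hypothetical helper, not provided by Merb.
def plausible_app_name?(name)
  !!(name =~ VALID_APP_NAME)
end

['my_test_app', 'Test2', 'T 3'].each do |name|
  puts "#{name.inspect}: #{plausible_app_name?(name)}"
end
```

With a name picked, generating the app is a single command: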

merb-gen app test

This will generate an empty Merb app, so let's go in and take a look. You'll notice that the directory structure is similar to Rails, with a few differences.

# expected output
RubiGen::Scripts::Generate
  create  log
  create  gems
  create  app
  create  app/controllers
  create  app/helpers
  create  app/views
  create  app/views/exceptions
  create  app/views/layout
  create  autotest
  create  config
  create  config/environments
  create  public
  create  public/images
  create  public/stylesheets
  create  spec
  create  app/controllers/application.rb
  create  app/controllers/exceptions.rb
  create  app/helpers/global_helpers.rb
  create  app/views/exceptions/internal_server_error.html.erb
  create  app/views/exceptions/not_acceptable.html.erb
  create  app/views/exceptions/not_found.html.erb
  create  app/views/layout/application.html.erb
  create  autotest/discover.rb
  create  autotest/merb.rb
  create  autotest/merb_rspec.rb
  create  config/rack.rb
  create  config/router.rb
  create  config/init.rb
  create  config/environments/development.rb
  create  config/environments/production.rb
  create  config/environments/rake.rb
  create  config/environments/test.rb
  create  public/merb.fcgi
  create  public/images/merb.jpg
  create  public/stylesheets/master.css
  create  spec/spec.opts
  create  spec/spec_helper.rb
  create  /Rakefile

Configuring Merb

Before we get the server running, you'll need to edit the init.rb file and un-comment the following line (this is only necessary if you need to connect to a database, which we do in our case):

config/init.rb

use_orm :datamapper

Typing merb at the command line will now start the server.

Loaded DEVELOPMENT Environment...
No database.yml file found in /Users/work/merb/example_one/config, assuming database connection(s) established in the environment file in /Users/work/merb/example_one/config/environments
loading gem 'merb_datamapper' ...
Compiling routes...
Using 'share-nothing' cookie sessions (4kb limit per client)
Using Mongrel adapter

As you can see, however, we have not yet configured the database. Let's create the database.yml file that Merb is looking for:

config/database.yml

# This is a sample database file for the DataMapper ORM
development:
   adapter: mysql
   database: test
   username: root
   password: 
   host: localhost
   socket: /tmp/mysql.sock

Don't forget to specify your socket. If you do not know its location, you can find it by typing:

mysql_config --socket

Starting Merb again shows that everything is running okay.

The following command will give you access to the Merb interactive console:

merb -i

You'll notice Merb runs on port 4000, but this can be changed with the -p [port number] flag. More options can be found by typing:

merb --help

You can even run Merb with any application server that supports Rack (thin, evented_mongrel, fcgi, mongrel and webrick):

merb -a thin

You may see a 500 error with the following message when navigating to localhost:4000 in your browser:

undefined method `match' for Merb::Router:Class - (NoMethodError)

This means Merb has been started outside of your application's root directory.

The Framework

The directory structure of the project created should look like the following. We'll give a brief overview of the framework here and go into further detail on each component in subsequent chapters.

test
  |--> app
  |--> autotest
  |--> config
  |--> log
  |--> public
  `--> spec

The app folder contains your models, views, controllers and helpers. It also has Parts, which inherit from AbstractController. They are similar to the old Rails components but are lightweight, and are useful for sidebars, widgets and so on.

Mailers, which also inherit from AbstractController, have their own folder where their controllers and views live.

app
  |--> controllers
  |--> models (generated with a model)
  |--> helpers
  |--> mailers (generated with a mailer)
  |--> helpers
  |--> parts (generated with a parts controller)
  `--> views

The config folder has all the configuration files and environments. It's important to edit the init.rb and database.yml files in here before running Merb.

The Merb router, which maps incoming requests to controllers, is also here. The rack.rb file is the Rack handler, and you can pass an option to merb -a to change the Rack adapter.

config
  `--> environments

RSpec specs can be found in the spec folder.

spec

In addition to these folders you can have a gems directory, which stores frozen gems (see Freezing Gems for more info), and a lib folder to store other Ruby files.

A little blog

What will be covered

In the examples, we'll be developing a small blogging application. It's a good idea to grab the source code from http://github.com/deimos1986/book_mdar/tree/master/code, so you can follow along with the examples.

First of all, let's define some of the functionality we would expect from any blogging application.

We're going to call our app golb. Think of it as a backward blog. Feel free to change the name of your app, but if you do, remember to replace the word golb with the name of your app.

To make a new app we'll use the command

merb-gen app golb

Set up the configuration files for your application. This lets Merb know which gems to load for plugins and generators.

config/init.rb

use_orm :datamapper

use_test :rspec

dependencies "dm-validations"

Now add a config/database.yml file with the following:

---
# This is a sample database file for the DataMapper ORM
development: &defaults
  # These are the settings for repository :default
  adapter:  mysql
  database: golb
  encoding: utf8
  username: root
  password: 
  host:     localhost

  # Add more repositories
  # repositories:
  #   repo1:
  #     adapter:  postgresql
  #     database: sample_development
  #     username: the_user
  #     password: secrets
  #     host:     localhost
  #   repo2:
  #     ...

test:
  <<:       *defaults
  database: golb_test

  # repositories:
  #   repo1:
  #     database: sample_development

production:
  <<:       *defaults
  database: golb_production

  # repositories:
  #   repo1:
  #     database: sample_development

---
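
The &defaults anchor and the <<: *defaults lines in that file mean that the test and production sections inherit every development setting and override only the database name. If you want to convince yourself, here is a small sketch using nothing but Ruby's standard YAML library on a trimmed copy of the file. The load_config helper is made up for this example and has nothing to do with how Merb reads the file.

```ruby
require 'yaml'

# A trimmed copy of config/database.yml from above.
DATABASE_YML = <<~YAML
  development: &defaults
    adapter:  mysql
    database: golb
    encoding: utf8
    username: root
    password:
    host:     localhost

  test:
    <<:       *defaults
    database: golb_test

  production:
    <<:       *defaults
    database: golb_production
YAML

# Newer versions of Ruby's YAML library refuse aliases in YAML.load by
# default, so use unsafe_load where it exists. That is fine here because the
# text is our own, trusted configuration.
def load_config(text)
  YAML.respond_to?(:unsafe_load) ? YAML.unsafe_load(text) : YAML.load(text)
end

config = load_config(DATABASE_YML)

%w[development test production].each do |env|
  settings = config[env]
  puts "#{env}: adapter=#{settings['adapter']} database=#{settings['database']}"
end
```

Each environment reports the mysql adapter, but with its own database name.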

Note: DataMapper has a rake task to generate a default database.yml file:

rake dm:db:database_yaml

You can also put a database URI in development.rb (or other environments) just as easily:

Merb::BootLoader.after_app_loads do
  DataMapper.setup(:default, 'mysql://user:pass@localhost/database')
end
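
If the URI form is new to you, it packs the same settings as database.yml into one string. The sketch below pulls one apart with Ruby's standard uri library purely to show which piece is which. DataMapper does its own parsing, which may differ in detail, and the hash keys are simply borrowed from database.yml.

```ruby
require 'uri'

uri = URI.parse('mysql://user:pass@localhost/database')

# Map each URI component onto the matching database.yml key.
settings = {
  'adapter'  => uri.scheme,
  'username' => uri.user,
  'password' => uri.password,
  'host'     => uri.host,
  'database' => uri.path.sub(%r{\A/}, '')  # drop the leading slash
}

settings.each { |key, value| puts "#{key}: #{value}" }
```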

Now we're ready to rock and roll ...

Models

(TODO) - rewrite for DM 0.9 almost done, just need to finish it!

Getting started

Building a model with Merb and DataMapper requires generating a model, specifying attributes (properties), and running a migration to create the database table and all the properties. Generating a model is similar to Rails, as is running a migration. But unlike ActiveRecord, DataMapper does not use migration files to define the model.

Instead, properties are defined in the model itself. This allows you to easily see how your models map to the database and removes the headache of trying to use separate migration files (when you have conflicting or irreversible migrations).

The Model Generator

DataMapper has a model generator just as Rails does:

merb-gen model post

This will make a post model for you, provided that you have defined an ORM and the golb database in the previous steps.

Note: Sometimes you might prefer to directly create a resource (Model, Controller, View) instead of calling the generator three times:

merb-gen resource post

Properties

As previously stated, DataMapper models differ a bit from ActiveRecord models. Database columns are defined with the property method. Add this code to the Post class:

app/models/post.rb

property :id, Integer, :serial => true
property :title,  String, :lazy => false

This creates a primary key (the id property) and the title property of the post model. As we can see, the parameters are the name of the table column followed by the type and finally the options.

Note: We could have also directly set the properties when we called the generator:

merb-gen model post title:string

By default, the lazy attribute is set to false for everything except text fields.

Some of the available options are: (TODO) - cover more properties

:public, :protected, :private, :accessor, :reader, :writer,
:lazy, :default, :nullable, :key, :serial, :field, :size, :length,
:format, :index, :check, :ordinal, :auto_validation, :validates, :unique,
:lock, :track, :scale, :precision

:key          - Set as primary key
:serial       - auto-incrementing key
:lazy         - Lazy load the specified property (:lazy => true).
:default      - Specifies the default value
:field        - Specifies the table column
:nullable     - Can the value be null?
:index        - Creates a database index for the column
:accessor     - Set method visibility for the property accessors. Affects both
                reader and writer. Allowable values are :public, :protected, :private.
:reader       - Like the accessor option but affects only the property reader.
:writer       - Like the accessor option but affects only the property writer.
:protected    - Alias for :reader => :public, :writer => :protected
:private      - Alias for :reader => :public, :writer => :private
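
To make the visibility options a little more concrete, this is roughly what a public reader with a private writer looks like in plain Ruby. No DataMapper is involved and the Draft class is made up for the example, so treat it as an analogy for :writer => :private rather than as what the property method generates.

```ruby
# Plain Ruby stand-in for a property with a public reader and private writer.
class Draft
  attr_reader :title

  def initialize(title)
    self.title = title      # a private writer may still be called on self
  end

  # The only public way to change the title.
  def retitle(new_title)
    self.title = new_title
  end

  private

  attr_writer :title
end

draft = Draft.new('My first post')
draft.retitle('A better title')
puts draft.title

begin
  draft.title = 'Set from outside'
rescue NoMethodError => e
  puts "rejected: #{e.class}"
end
```

Code inside the model can still assign the value, while callers outside it can only read.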

(TODO) - talk about accessors and overriding them

DataMapper supports the following properties in the core:

(TODO) - creating your own custom properties

CRUD

Creating

Before a new record is created, be sure you have synchronized your model with the database. In order to do this, load the Merb console with:

 merb -i

Then migrate your Post model with:

Post.auto_migrate!

To create a new record, just call the method create on a model and pass it your attributes.

@post = Post.create(:title => 'My first post')

Or you can instantiate an object with #new and save it to the repository later:

@post = Post.new
@post.title = 'My first post'
@post.save

There is also a first_or_create method, similar to AR's find_or_create, which attempts to find an object with the attributes provided and creates the object if it cannot find one:

@post = Post.first_or_create(:title => 'My first post')

There are a couple of different ways to set attributes on a model:

@post.title = 'My first post'
@post.attributes = {:title => 'My first post'}
@post.attribute_set(:title, 'My first post')

Find out if an attribute has been changed (aka is dirty):

@post = Post.first
@post.dirty?
=> false
@post.attribute_dirty?(:title)
=> false
@post.title = 'Changing the title'
@post.dirty?
=> true
@post.attribute_dirty?(:title)
=> true
@post.dirty_attributes
=> Set: {#<Property:Post:title>}
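
If you are wondering how an object can know it is dirty, the idea is simply to remember the values as they were loaded and compare. The following plain Ruby sketch shows that idea only. TrackedPost is a made-up class and this is not how DataMapper implements it.

```ruby
require 'set'

# Illustration of dirty tracking: keep the loaded values next to the current ones.
class TrackedPost
  def initialize(attributes = {})
    @original = attributes.dup   # values as last loaded or saved
    @current  = attributes.dup   # values as they are right now
  end

  def title
    @current[:title]
  end

  def title=(value)
    @current[:title] = value
  end

  # Names of all attributes whose current value differs from the original.
  def dirty_attributes
    Set.new(@current.keys.select { |name| @current[name] != @original[name] })
  end

  def attribute_dirty?(name)
    dirty_attributes.include?(name)
  end

  def dirty?
    !dirty_attributes.empty?
  end

  # Pretend to persist: afterwards nothing is dirty any more.
  def save
    @original = @current.dup
    true
  end
end

post = TrackedPost.new(:title => 'My first post')
puts post.dirty?
post.title = 'Changing the title'
puts post.dirty?
puts post.attribute_dirty?(:title)
post.save
puts post.dirty?
```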

Reading (aka finding)

The syntax for retrieving data from the database is clean and simple, as you can see in the following examples.

Finding a post with one as its primary key is done with the following:

# will raise a DataMapper::ObjectNotFoundError if not found
# use #get to just return nil if no record is found
Post.get!(1)

To get an array of all the records for the post model:

Post.all

To get the first post, with the condition author = 'Matt':

Post.first(:author => 'Matt')

When retrieving data the following parameters can be used:

#   Post.all :order => 'created_at desc'              # => ORDER BY created_at desc
#   Post.all :limit => 10                             # => LIMIT 10
#   Post.all :offset => 100                           # => OFFSET 100
#   Post.all :includes => [:comments]

If a parameter is not one of these options, it is assumed to be an attribute of the object.

You can also use symbol operators with the find to further specify a condition, for example:

Post.all :title.like => '%welcome%', :created_at.lt => Time.now

This would return all the posts where the title was like 'welcome' and which were created in the past.
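
Calling .like on a symbol can look like magic at first. The trick is that the symbol gains a method which wraps it in a small operator object, and the finder then inspects that object. Here is a toy version in plain Ruby that filters an array of hashes instead of building SQL. Operator, matches? and filter are names invented for this sketch and are not DataMapper's classes.

```ruby
# A symbol plus the comparison to apply to it.
Operator = Struct.new(:target, :name)

# Only the two operators used in the example above are sketched here.
class Symbol
  def like; Operator.new(self, :like); end
  def lt;   Operator.new(self, :lt);   end
end

# Does one record (a Hash) satisfy one condition?
def matches?(record, key, expected)
  return record[key] == expected unless key.is_a?(Operator)

  value = record[key.target]
  case key.name
  when :like
    # Translate SQL's % wildcard into a regular expression.
    pattern = Regexp.escape(expected).gsub('%', '.*')
    !!(value =~ /\A#{pattern}\z/i)
  when :lt
    value < expected
  end
end

def filter(records, conditions)
  records.select { |record| conditions.all? { |key, value| matches?(record, key, value) } }
end

posts = [
  { :title => 'A warm welcome', :created_at => Time.now - 3600 },
  { :title => 'Goodbye',        :created_at => Time.now - 60 }
]

found = filter(posts, :title.like => '%welcome%', :created_at.lt => Time.now)
puts found.map { |post| post[:title] }.inspect
```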

Here is a list of the valid operators:

TODO: execute sql via the adaptor.

Updating

Updating attributes has a similar syntax to AR's update_attributes:

@post.update_attributes(:title => 'Oops, the title has changed!')

You can also just set attributes and then save:

@post = Post.first
@post.title = 'New Title!'
@post.save

Destroying

You can destroy database records with the destroy method, which works much like AR's.

bad_comment = Comment.first
bad_comment.destroy

Associations

Like ActiveRecord, DataMapper has associations which define relationships between models. There is a difference in syntax but the underlying idea is the same. Continuing with the Post model we can see a few of the associations defined:

has n, :comments
belongs_to :author, :class => 'User', :child_key => [:author_id]

The has n syntax is a very flexible way to define associations and is the standard way in DataMapper > 0.9. It can be used to model all of ActiveRecord's associations, plus more. The types of associations currently in DataMapper are:

 # DataMapper 0.9                        | ActiveRecord
 has n, :things                          # has_many :things
 has 1, :thing                           # has_one :thing
 belongs_to :item                        # belongs_to :item
 many_to_one :item                       # belongs_to :item
 has n, :items, :through => Resource     # has_and_belongs_to_many :items
 has n, :gizmos, :through => :things     # has_many :gizmos, :through => :things

The has n syntax is more powerful than the list above suggests. Since n is the cardinality of the association, it can be an arbitrary range. Some examples:

has 0..n #=> will have a MIN of 0 records and a MAX of n
has 1..n #=> will have a MIN of 1 record and a MAX of n
has 1..3 #=> will have a MIN of 1 record and a MAX of 3
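
One way to picture these ranges is to treat n as "no upper limit". The sketch below assumes exactly that, modelling n as infinity, and checks a few record counts against each range. The within_cardinality? helper is hypothetical, and nothing here says whether or when DataMapper enforces the limits.

```ruby
# Assumption: n means "no upper limit", modelled here as infinity.
def n
  Float::INFINITY
end

# Hypothetical helper: does a number of associated records fit the range?
def within_cardinality?(range, count)
  range.include?(count)
end

puts within_cardinality?(0..n, 0)   # zero records is fine for 0..n
puts within_cardinality?(1..n, 0)   # but not for 1..n
puts within_cardinality?(1..3, 2)
puts within_cardinality?(1..3, 4)
```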

Pretty straightforward. There are a few things you should note, however. You do not need to specify the foreign key as a property if it's defined in the association.

You also don't have to specify a relationship at all if you don't want to, as models can have one way relationships.

Polymorphic associations

(TODO) -polly assoc

Where is my has_many :through?!

DataMapper > 0.9 now supports has_many :through. For example, if you have a Post model that has many Categories through the Categorization model you would define these associations:

 class Post
   include DataMapper::Resource

   has n, :categorizations
   has n, :categories, :through => :categorizations

 end

 post = Post.first
 post.categorizations #=> []
 post.categories #=> []
 # to attach a category to a post:
 post.categorizations << Categorization.new(:category => Category.first)
 # or you could just create a Categorization object passing in both category and post:
 Categorization.create(:post => post, :category => Category.first)

Has And Belongs To Many (HABTM)

A has n :through relationship is useful, especially when the join model itself has a lot of information on it. Perhaps it is a subscription which contains the join date and the user's rating for the feed it tracks. Sometimes, however, the join model is very simple: just a table with two id columns.

For this, DataMapper offers an alternative to :through => :models, which is :through => Resource. The use of Resource tells DataMapper to automatically create a join table. So to revisit the previous example:

 class Post
   include DataMapper::Resource

   has n, :categories, :through => Resource

 end

 post = Post.first
 post.categories #=> []
 post.categories << Category.first
 post.save
 post.categories #=> [Category.first]

The join table this creates would be called posts_categorizations, and it would contain the two keys of each post-categorization pair.

Validation

(TODO) - still needs 0.9 love - mostly done now, I think

It's a known fact that users will enter invalid, blank or malicious data into your web app.

We need to guard against user error by validating anything that we need to save out to our persistence layers. Sometimes that means guarding against hack attempts, but most of the time it means guarding against invalid data and accidents.

Both ActiveRecord and DataMapper have a concept called Validations, which is ultimately a set of callbacks which fire right before an object gets saved out to our persistence layer and interrupt things when it detects something awry. To use them in DataMapper, all we have to do is require the gem dm-validations.

require 'dm-validations'

class Post
  include DataMapper::Resource

  property :id, Integer, :serial => true
  property :title, String, :length => 0..255
  property :body, Text
  property :original_uri, String, :length => 0..255
  property :created_at, DateTime
  property :can_be_displayed, Boolean, :default => false

  validates_present :body

end

How many validations do we have on the content of the Post class? To someone familiar with ActiveRecord, the answer is obviously one. We have a validation that the body must contain something, that it is present. In fact DataMapper, through dm-validations, has set up four validations for us. When we declare a property with an option like :length => 0..255, DataMapper not only sets the maximum length of the field, it also adds a validation to check that the supplied value will fit within that field. So when we validate our model DataMapper will check we ...

And also, without us having to type anything, that we ...

We can test this by calling valid? on one of our posts:

@post = Post.new
@post.valid?
=> false
@post.title = "A cool story!"
@post.body = "It was a dark and stormy ..."
@post.valid?
=> true

If an object isn't valid, you can access its errors by calling its errors method.

@post.errors
=> #<DataMapper::Validate::ValidationErrors:0x2537e40 @errors={:body=>["Body must not be blank"]}>
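
The automatic part can feel a little mysterious, so here is the general idea in plain Ruby: a property declaration that notices a :length option and quietly registers a matching check. MiniResource and Article are made up for this sketch. It is an illustration of the technique, not the dm-validations source, and its error messages are invented.

```ruby
# Illustration only: a tiny stand-in for a resource with auto-validation.
module MiniResource
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def validations
      @validations ||= []
    end

    # Declaring a property creates accessors and, when :length is given,
    # registers a length check without being asked to.
    def property(name, _type, options = {})
      attr_accessor name
      return unless options[:length]

      range = options[:length]
      validations << lambda do |record|
        size = record.send(name).to_s.length
        "#{name} must be between #{range.min} and #{range.max} characters" unless range.include?(size)
      end
    end

    # An explicit validation, declared by hand.
    def validates_present(name)
      validations << lambda do |record|
        "#{name} must not be blank" if record.send(name).to_s.strip.empty?
      end
    end
  end

  # Run every registered check and keep the messages of those that failed.
  def errors
    self.class.validations.map { |check| check.call(self) }.compact
  end

  def valid?
    errors.empty?
  end
end

class Article
  include MiniResource

  property :title, String, :length => 0..255
  property :body,  String
  validates_present :body
end

article = Article.new
puts article.errors.inspect

article.body  = 'It was a dark and stormy ...'
article.title = 'x' * 300
puts article.errors.inspect
```

Only one validation was written by hand, yet the over-long title is still caught.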

Contextual Validation

A problem arises when your website has users creating content and content being created automatically from scrapers or some sort of automated background process (be it from RSS feeds, an FTP server or a web service). No idiots are involved in the creation of content when it's imported into the system and you likely really want that content to appear in your system. This is where context specific validations come into play.

Contexts let you control which validations run when you perform a particular operation. You might want to make sure that a user enters the title for a blog post in your system, but you don't really want such a check for when that blog post comes in off of your RSS scraping system. Maybe you'd send those imported blog posts into a holding pen somewhere so that they can be rescued later, rather than preventing their save and never importing them in at all.

With ActiveRecord, if you declare a validates_presence_of on :title, that's it - game over. The only way to bypass that validation is to call save_without_validation, and that skips all of your validations rather than just this one.

But with DataMapper and dm-validations, you can check for the validity of an object depending on the circumstance you're in. Here's what that blog post model would look like if we wanted to validate blog posts by idiots, but not from our not-so-idiotic scraper:

require 'dm-validations'

class Post
  include DataMapper::Resource

  property :id, Integer, :serial => true
  property :title, String, :length => 0..255, :auto_validation => false
  property :body, Text
  property :original_uri, String, :length => 0..255, :auto_validation => false
  property :created_at, DateTime
  property :can_be_displayed, Boolean, :default => false

  # user creation
  validates_present :title, :when => [:default, :display]
  validates_present :body, :when => [:default, :display, :import]

  # automated import
  validates_length :original_uri, :in => 0..255, :when => :import

  # a callback to set can_be_displayed appropriately (more on these later)
  before :save do
    self.can_be_displayed = true if self.valid? :display
  end
end

Running quickly through my sample here, you'll spot a few things. The first is the :auto_validation => false on the title and the original_uri. Because we want to define custom contexts for when we need these properties to be checked, we have to override the ones dm-validations adds by default. The second are the :when => [...] following some of our validations. These define in what situation (or context) these validations will be applied.

To check if a post is valid in a particular context, we pass the context as an argument to valid?. For example, @post.valid? :display tells us if the post is valid for displaying. These contexts are also honoured by the save method, allowing us to call @post.save :import after our RSS scraper has parsed the RSS feed and assigned our variables.

You'll notice that I gave :body a validates_present for all my contexts. This means that, no matter what, that validation will kick in. At present there doesn't appear to be a meta "all" context that fires under any circumstances, so each context has to be listed explicitly.
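To make the idea of contexts concrete, here is a toy model in plain Ruby. It is not how dm-validations is implemented; the ContextualPost class and its RULES table are invented for this sketch and only mirror the :when lists from the Post model above.

```ruby
# Toy model of validation contexts in plain Ruby (no DataMapper required).
# ContextualPost and RULES are hypothetical names used only for this sketch.
class ContextualPost
  # Each rule names the attribute that must be present and the contexts it applies in.
  RULES = [
    { :attribute => :title, :when => [:default, :display] },
    { :attribute => :body,  :when => [:default, :display, :import] }
  ]

  attr_accessor :title, :body

  def initialize(attrs = {})
    @title = attrs[:title]
    @body  = attrs[:body]
  end

  # Only the rules registered for the given context are checked.
  def valid?(context = :default)
    RULES.select { |rule| rule[:when].include?(context) }.all? do |rule|
      value = send(rule[:attribute])
      !(value.nil? || value.to_s.strip.empty?)
    end
  end
end

scraped = ContextualPost.new(:body => "Imported from a feed")
puts scraped.valid?(:import)   # title is not checked in this context
puts scraped.valid?(:display)  # title is required here
```

The scraped post passes in the :import context and fails in :display, which is exactly the split the real model is after.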

Also of note is the can_be_displayed boolean and the before :save manual callback I defined. Here, I'm helping myself out later on so that it's easy to pull out valid blog posts that can be displayed without worrying about nil field values and such:

@posts = Post.all(
  :title.not => nil,
  :slug.not => nil,
  :order => :created_at.desc,
  :limit => 10
)

Becomes…

@posts = Post.all(
  :can_be_displayed => true,
  :order => :created_at.desc,
  :limit => 10
)

Pretty sexy, no? I can't off-hand think of a way to get this functionality from ActiveRecord objects without a lot of fuss and bother - perhaps using single-table inheritance and with the validations on the subclasses?

With the proper use of validation contexts, you end up saving yourself a lot of headache and work later on down the line, as well as supporting different scenarios where a post might be valid or might not -- all without having to hack-around. How enterprise-y!

validates_with_method

Another very powerful feature in dm-validations is validates_with_method. Think of it as overriding valid?, only with the full power of real validations still there too.

Say, for example, you've got an Event model that needs to make sure the end_time for the event is later than the start_time. We wouldn't want to break the laws of physics, so with ActiveRecord we'd do something like:

class Event < ActiveRecord::Base
  def valid?
    start_time < end_time
  end
end

Yup, it's pretty simple with ActiveRecord. Just toss in our own valid? method and we're done. With DataMapper, things are a touch more complicated, but not difficult, and buy you the full power of dm-validations:

class Event
  include DataMapper::Resource

  # properties here

  validates_with_method :check_times

  def check_times(context = :default)
    if start_time < end_time
      return true
    else
      return [false, 'End time must be after start time']
    end
  end
end

So, a couple of things are going on here. First, we declare that we're going to use our custom validation method check_times for the model. Then comes the method itself. It's a pretty simple method. If our start_time is before our end_time, return true as we're valid. Otherwise, it returns an array. The first entry in the array is false, which lets DataMapper know the validation has failed. The second entry is a string, which is added to @event.errors so the user has some idea what has gone wrong.

Of course, this custom validator can also be applied only in certain contexts, just by adding a :when => [...] on the validates_with_method line. This brings us a lot of flexibility, and as we're validating with a Ruby method, we can get as complex as we need to specify our behaviour. Much nicer than just overriding valid?. It's this functionality which requires the context to be passed in (although your method can feel free to ignore it).
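The "true, or an array of false and a message" convention can be sketched in plain Ruby. ToyEvent and run_validation below are hypothetical stand-ins for what dm-validations does on your behalf; only check_times is taken from the example above.

```ruby
# Plain-Ruby sketch of the "true or [false, message]" return convention.
# ToyEvent and run_validation are hypothetical; dm-validations does this for you.
class ToyEvent
  attr_reader :errors

  def initialize(start_time, end_time)
    @start_time = start_time
    @end_time   = end_time
    @errors     = []
  end

  def check_times(context = :default)
    if @start_time < @end_time
      true
    else
      [false, 'End time must be after start time']
    end
  end

  # Calls a validation method and records the message when it fails.
  def run_validation(method_name, context = :default)
    result, message = send(method_name, context)
    @errors << message unless result
    result ? true : false
  end
end

event = ToyEvent.new(Time.at(200), Time.at(100))
puts event.run_validation(:check_times)
puts event.errors.inspect
```

Because the method receives the context, it could apply a stricter or looser check per context if it wanted to.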

Migrations

There are rake tasks to migrate your models, but be warned: automigrate is currently destructive!

rake dm:db:automigrate    # Automigrates all models
rake dm:db:autoupgrade     # Perform non destructive automigration

You can also create the table for a single model from the Merb console (merb -i):

Post.auto_migrate!

or

Post.auto_upgrade!

The following does the same job as the rake task, migrating all your models:

DataMapper.auto_migrate!

Why the two commands? They both do slightly different things.

The first, auto_migrate!, works by dropping the table (if it exists) and all of its data then working out which columns need to exist from the model definition, before finally rebuilding the table in the database. This includes any constraints imposed. For example, :nullable => false will add a NOT NULL to the column definition.

auto_upgrade! on the other hand, creates the table from nothing only if the table isn't there already. If it is there, then it compares the current table to the model. If there are properties in the model not defined as columns in the table, it will add them to the table. It does have some limitations though. It doesn't delete columns, and it can't detect renaming them.
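The difference between the two is easy to see with a toy in-memory schema. The helpers toy_auto_migrate and toy_auto_upgrade are invented for this sketch and only model the behaviour described above; they are not DataMapper API.

```ruby
# Toy in-memory "database": table name => { :columns => [...], :rows => [...] }.
# toy_auto_migrate and toy_auto_upgrade are hypothetical helpers, not DataMapper API.
def toy_auto_migrate(db, table, model_columns)
  # Drop the table (and its data), then rebuild it from the model.
  db[table] = { :columns => model_columns.dup, :rows => [] }
end

def toy_auto_upgrade(db, table, model_columns)
  # Create the table only if it is missing.
  db[table] ||= { :columns => [], :rows => [] }
  # Add columns the model has but the table lacks; never remove any.
  missing = model_columns - db[table][:columns]
  db[table][:columns].concat(missing)
  db[table]
end

db = { :posts => { :columns => [:id, :name], :rows => [{ :id => 1, :name => "First" }] } }

# The model renamed :name to :title. An upgrade cannot know that.
toy_auto_upgrade(db, :posts, [:id, :title])
puts db[:posts][:columns].inspect  # the old :name column is kept alongside :title
puts db[:posts][:rows].length      # the data survives

toy_auto_migrate(db, :posts, [:id, :title])
puts db[:posts][:columns].inspect  # rebuilt from the model
puts db[:posts][:rows].length      # the data is gone
```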

Migration Files

Whilst the preceding commands and tasks can keep the database schema in perfect sync with the models, they can also wipe out any data you might have in the database, or fail to remove columns which are no longer needed. To avoid this, AR style migrations are also supported. These are stored in schema/migrations and are ruby files.

migration(1, :add_homepage_to_comments ) do
  up do
    modify_table :comments do
      add_column :homepage, String, :length => 100, :nullable => true
    end
  end

  down do
    modify_table :comments do
      drop_column :homepage
    end
  end
end

The first line of the file is what identifies the migration, and there are two components to it. The more important one is the name, :add_homepage_to_comments, which must be unique across all the migrations applied to the database. The other parameter, 1 in this case, is the level, or the order in which migrations are applied. This number doesn't have to be unique, although a migration mustn't have a lower number than a migration it depends on. You shouldn't define modifications to a table to happen before that table is made, for example.

The up and down blocks describe the actual behaviour of the migration. up is what happens when the migration is applied, and down happens when the migration is 'undone'. This might not mean undone in the literal sense - if you migrate to remove a column, and add it back in the down migration, while the column will be there, all the data will be lost.

In this case, as the name suggested, we add a homepage column to comments. It's specified much like a property (and should match up with the relevant property in the model.rb file). It also takes many of the same options - essentially all those which are just database features: :length, :nullable are valid, but :private, which is a pure ruby option is not allowed. The down migration is the opposite - it removes the column.

To apply the migrations, there are a couple of rake tasks available through merb_datamapper:

rake dm:db:migrate:up                   # migrates the database up
rake dm:db:migrate:down                 # migrates the database down

These apply or remove all the migrations in turn. Sometimes, you don't want to go all the way up (or down) and so you can also specify a level to migrate to, via VERSION=2 or by invoking a task like rake dm:db:migrate:up[2]. For both up and down migrations, the version determines the highest level that will be reflected in the database, either by applying up migrations until the level is complete or by applying all the down migrations greater than the given level.
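The rule for levels can be sketched in a few lines of plain Ruby. ToyMigration and the two helper methods are hypothetical, and the sketch ignores any bookkeeping of which migrations have already been applied; it only shows which migrations a target level selects.

```ruby
# Sketch of how a target level selects migrations. ToyMigration and the two
# helper methods are hypothetical and only model the rule described above.
ToyMigration = Struct.new(:level, :name)

MIGRATIONS = [
  ToyMigration.new(1, :create_comments),
  ToyMigration.new(2, :add_homepage_to_comments),
  ToyMigration.new(2, :add_email_to_comments),
  ToyMigration.new(3, :create_tags)
]

# Going up: run every migration at or below the target, lowest level first.
def migrations_to_apply(migrations, target)
  migrations.select { |m| m.level <= target }.sort_by { |m| m.level }
end

# Going down: undo every migration above the target, highest level first.
def migrations_to_undo(migrations, target)
  migrations.select { |m| m.level > target }.sort_by { |m| -m.level }
end

puts migrations_to_apply(MIGRATIONS, 2).map { |m| m.name }.inspect
puts migrations_to_undo(MIGRATIONS, 2).map { |m| m.name }.inspect
```

Two migrations share level 2 here, which is allowed because neither depends on the other.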

There are a couple of generators to make migrations

merb-gen migration name_of_migration    # an empty migration
merb-gen resource_migration Post        # a migration for the post class

The first creates an empty migration stub with the name defined and an up and down block. The second loads up the class in question from app/models and does its best to construct the appropriate migration from the properties of the model. It currently doesn't generate anything to do with relationships, however.

Other Misc Things

Callbacks

Callbacks in DataMapper > 0.9 are very powerful. In any DataMapper::Resource you can set before and after callbacks on any instance/class method. There are a couple of different ways to define callbacks:

class Post
  include DataMapper::Resource

  property :id, Integer, :serial => true
  property :title, String, :length => 200

  # before save call the instance method make_permalink
  before :save, :make_permalink

  def make_permalink
    self.title = PermalinkFu.permalink(self.title)
  end

  #callbacks can be defined for any method
  after :publish, :send_message

  def publish
    # do some publishing here
  end

  def send_message
    # email someone here
  end

  # defining a callback on a class method, passing in a block to run before its created.
  before_class_method :create do
    # do something before a record is created
  end

end
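If you're curious how before and after hooks on arbitrary methods can work at all, here is a minimal plain-Ruby sketch. ToyHooks is a hypothetical module written for this book, not DataMapper's hook code, and it is far less capable: for one thing, it needs the target method to exist before the hook is declared.

```ruby
# Minimal before/after hook mechanism in plain Ruby. ToyHooks is a
# hypothetical module; DataMapper's own hook code is more capable.
module ToyHooks
  def before(target, callback)
    wrap(target) do |obj, original, args|
      obj.send(callback)
      original.bind(obj).call(*args)
    end
  end

  def after(target, callback)
    wrap(target) do |obj, original, args|
      result = original.bind(obj).call(*args)
      obj.send(callback)
      result
    end
  end

  private

  # Replaces the target method with one that runs the given block around it.
  def wrap(target, &around)
    original = instance_method(target)
    define_method(target) { |*args| around.call(self, original, args) }
  end
end

class ToyPost
  extend ToyHooks
  attr_reader :log

  def initialize
    @log = []
  end

  def publish
    @log << :publish
  end

  def send_message
    @log << :send_message
  end

  def tidy_up
    @log << :tidy_up
  end

  # In this sketch the hooked method must already be defined.
  before :publish, :tidy_up
  after  :publish, :send_message
end

post = ToyPost.new
post.publish
puts post.log.inspect
```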

Bulk Operations

Sometimes, you have to operate on a large number of records at once, to do exactly the same thing to each of them. Take the earlier example of deleting old comments via each: it involved several SELECTs and then lots of DELETEs, potentially hundreds, depending on the size of the database. Wouldn't it be nice if you could just go Comment.all(:date.lt => Date.today - 20).destroy! and it would produce an appropriate query to do it in one operation, without loading all those comments which are about to be deleted?

Well, that's what happens. Collections are 'lazily evaluated', which is to say, they don't do anything until they've been 'kicked'. .each, mentioned earlier, is a kicker method. It issues a SELECT appropriate to the conditions. .destroy! is another one, except it issues a DELETE. The other bulk method is update!, which looks like (example taken from the DataMapper source)

Person.all(:age.gte => 21).update!(:allow_beer => true)

This command would update the allow_beer attribute of all people aged 21 or older in the database, all in one UPDATE statement.
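Lazy evaluation is easier to trust once you've seen it in miniature. ToyCollection below is a hypothetical class that records conditions and builds SQL strings rather than running anything; it is a sketch of the idea, not DataMapper's Collection.

```ruby
# Toy lazy collection: it only records conditions until a "kicker" is called.
# ToyCollection is hypothetical and builds SQL strings instead of running them.
class ToyCollection
  attr_reader :statements

  def initialize(table, conditions = [], statements = [])
    @table, @conditions, @statements = table, conditions, statements
  end

  # Narrowing the collection issues no query; it returns a new collection.
  def all(condition)
    ToyCollection.new(@table, @conditions + [condition], @statements)
  end

  # Kicker methods: each one issues a single statement.
  def each
    @statements << "SELECT * FROM #{@table}#{where}"
    self
  end

  def destroy!
    @statements << "DELETE FROM #{@table}#{where}"
    true
  end

  def update!(attrs)
    assignments = attrs.map { |column, value| "#{column} = #{value.inspect}" }.join(", ")
    @statements << "UPDATE #{@table} SET #{assignments}#{where}"
    true
  end

  private

  def where
    @conditions.empty? ? "" : " WHERE " + @conditions.join(" AND ")
  end
end

people = ToyCollection.new("people")
adults = people.all("age >= 21")
puts people.statements.length        # nothing has been sent yet
adults.update!(:allow_beer => true)
puts people.statements.inspect       # one UPDATE, no SELECT
```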

Note: ActiveRecord has a well known Model.delete_all class method to erase all table entries. In DataMapper, to delete all instances of a model in the database, you would do Model.all.destroy!

Aggregates

DataMapper by default does not provide aggregator methods, but dm-aggregates in dm-more does. After adding dependency "dm-aggregates" to your merb init.rb file, your resource model will have aggregator methods including count, min, max, avg, and sum. You can pass conditions to any of these aggregator methods the same as Resource.first or Resource.all

Post.count :title.like => "%hello world%"

# you can also do a count on an association:
@post.comments.count

Post.avg(:reads_count)

Post.sum(:comments_count)

Each

Each works as expected, iterating over a number of rows and you can pass a block to it. The difference between Comment.all.each and Comment.each is that instead of retrieving all the rows at once, each works in batches instantiating a few objects at a time and executing the block on them (so is less resource intensive). Each is similar to a finder as it can also take options:

Comment.each(:date.lt => Date.today - 20) do |c|
  c.destroy
end

NB: This isn't currently working in DataMapper. However, it will be reimplemented soon.
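Until then, the batching idea itself is simple enough to sketch in plain Ruby. Everything here is hypothetical: ROWS stands in for a table, fetch_batch for a LIMIT/OFFSET query, and each_in_batches for the kind of iteration described above.

```ruby
ROWS = (1..7).to_a  # stands in for a table of seven comments

# Pretend query: returns at most `limit` rows starting at `offset`.
def fetch_batch(offset, limit)
  ROWS[offset, limit] || []
end

# Yields every row, but only ever holds one batch in memory at a time.
def each_in_batches(batch_size)
  offset = 0
  loop do
    batch = fetch_batch(offset, batch_size)
    break if batch.empty?
    batch.each { |row| yield row }
    offset += batch_size
  end
end

seen = []
each_in_batches(3) { |row| seen << row }
puts seen.inspect
```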

Changing the Table Name

You can set the name of the database table in your model if it is called something different by overriding a method in the class:

def default_storage_name
  'list_of_posts'
end

This is only necessary if you are using an already existing database. If you have a lot of tables to rename, consider instead a NamingConvention, detailed later.

TODO: Write NamingConventions section.

Routing

Routing in Merb is similar to Rails, as you'll see if you take a look at your router.rb file.

Strings/Regex

Hashes

Restful Routes

Nested Routes

URLs

(TODO) - Defining routes, and resources
(TODO) - Nested routes
(TODO) - Namespaces
(TODO) - Show routes, merb.show_routes in merb's irb console (merb -i)

Controllers

filters

(TODO) - filters, how the chaining works and :throw

Merb filters are quite powerful, etc..

In Rails:

before_filter :find_post
after_filter

In Merb:

before :login_required, :exclude => [:index, :show]
after  :send_email, :only => :create

skip_before is used to skip a before filter

NB: it's exclude not except
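As a rough guide to how :only and :exclude decide whether a filter runs for a given action, here is a plain-Ruby sketch. filter_applies? is a hypothetical helper written for illustration; it is not part of Merb.

```ruby
# Sketch of how :only and :exclude decide whether a filter runs for an action.
# filter_applies? is a hypothetical helper, not part of Merb.
def filter_applies?(action, options = {})
  only    = Array(options[:only])
  exclude = Array(options[:exclude])
  return false if exclude.include?(action)
  only.empty? || only.include?(action)
end

# before :login_required, :exclude => [:index, :show]
puts filter_applies?(:index,  :exclude => [:index, :show])
puts filter_applies?(:create, :exclude => [:index, :show])

# after :send_email, :only => :create
puts filter_applies?(:create, :only => :create)
puts filter_applies?(:update, :only => :create)
```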

(TODO) - how params get passed in controllers
(TODO) - Exception Controller
(TODO) - explain provides
(TODO) - usecase for a part, explain what they are (possibly comments?) - or a side bar of some sorts
(TODO) - Admin controller
(TODO) - specify a layout
(TODO) - rest
(TODO) - content_type
(TODO) - flash?

Views

(TODO) - form helpers
(TODO) - mention you can use other template languages

Partials

Use the partial method to render a partial from the current directory. If you pass a hash as the second argument the contents will be made available as local variables in the partial.

partial :post, {:comments => @post.comments}

To display the latest posts on our blog's front page, we use the :with and :as arguments to render a collection.

partial :post, :with => @posts, :as => :post

Mailers

(TODO) - sending mail (TODO) - mail templates in /views

Authentication

(TODO) - Rolling your own (TODO) - integrating RESTful auth (merbful)

Attachments

(TODO) - attachment_pu (TODO) - image resize/crop (TODO) - downloading

RSpec

RSpec is a testing framework which uses a Domain Specific Language or DSL to provide more human readable test code.

When using stubs with RSpec you can roughly divide the methods you are going to use into two categories. On one side you have the stub! and should_receive methods, which define what methods you expect to be called, with what parameters, and what they should return for the test being run. On the other side you have assertions, which test the output and the values of variables. The should method is primarily used when asserting specific results.

What to test?

(TODO) - how to write good test and what should just trust works

Stories

(TODO) - finish stories section

RSpec Stories are used to replace the specification phase in requirements gathering, in the form of scenarios. So we get both a spec and an integration test.

Add this line to your app's test environment:

dependency "merb_stories"

Now generate your story:

merb-gen story mystory

Now run your story:

MERB_ENV=test rake story[mystory]

Yes, you must include the square brackets.

Now fill out your story. There are some differences to Rails' versions. The best places to look for help are in the Merb code itself:

spec/public/test/controller_matchers_spec.rb
lib/merb-core/test/helpers
lib/merb-core/test/matchers

To start you off, here are the steps for a simple integration test:

steps_for(:homepage) do
  When("I visit the root") do
    @mycontroller = get("/")
  end
  Then("I should see the home page") do
    @mycontroller.should respond_successfully
    @mycontroller.body.should contain("Hello") 
  end    
end

Spec'ing Models

(TODO) - How to spec models, use example merb/dm test talk through them, mocking

Spec'ing Views

(TODO) - What they should test

For more information, check Merb's wiki

Spec'ing Controllers

Getting started

Testing controllers typically involves stubbing out some methods, making a fake request and then ensuring the right variables are assigned, exceptions are raised and views rendered.

A good start is testing the show action in our Posts controller.

class Posts < Application
  provides :html

  def show
    @post = Post.get!(params[:id])
    render @post
  rescue DataMapper::ObjectNotFoundError
    raise NotFound
  end
end

Our first test will ensure that Post.get!(1) is called when /posts/1 is visited, and when the post exists the response code is 200 OK.

describe Posts, "show action" do
  it "should find post and render show view" do
    Post.should_receive(:get!).with("1")
    controller = get('/posts/1') do |controller|
      controller.stub!(:render)
    end
    controller.should be_successful
  end
end

The first should_receive ensures that Post.get! is called with the id from the URL. We could mock out a Post instance to return here, but in this case we're only interested in it being called and not raising an exception.

Next we use the get method to make a request to the controller. The get method yields the controller, allowing us to stub out the render method, as we're not interested in how that behaves. Anything inside the get method's block will be executed before the request is dispatched.

After the request has been dispatched, it returns the controller. Several methods are available to examine the results from the request: body, status, params, cookies, headers, session, response and route.

This test was fairly simple, and it's likely you won't need such tests if your controllers are as simple as ours. But once you have more than a few lines in your controller, simple response status checks can be useful for ensuring the overall integrity of your app.

A more important test would be ensuring that a 404 is returned when the post cannot be found in the database. When DataMapper cannot find a record it raises DataMapper::ObjectNotFoundError. Merb has several useful exception classes which will set the correct status and then call the relevant action in your Exceptions controller. Raising NotFound will set the status to 404 and then call the not_found action, which can return a much nicer error page.

it "should return 404 if post doesn't exist" do
  Post.should_receive(:get!).with("1").and_raise(DataMapper::ObjectNotFoundError)
  controller = get('/posts/1')
  controller.status.should == 404
end

Unlike the last test there was no need for us to stub the render method because DataMapper::ObjectNotFoundError is raised before it is reached.
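The chain from "record missing" to "404 response" can be modelled in plain Ruby. All the names below (ToyRecordMissing, ToyNotFound, toy_dispatch, POSTS) are hypothetical and stand in for DataMapper::ObjectNotFoundError, Merb's NotFound and the dispatcher; this is a sketch of the flow, not Merb's implementation.

```ruby
# Toy dispatcher showing how an exception class can carry an HTTP status.
# ToyNotFound, ToyRecordMissing and toy_dispatch are hypothetical names.
class ToyRecordMissing < StandardError; end

class ToyNotFound < StandardError
  STATUS = 404
end

POSTS = { "1" => "Hello world" }

# Plays the part of the show action.
def show(id)
  POSTS.fetch(id) { raise ToyRecordMissing }
rescue ToyRecordMissing
  # Translate the storage-level error into a web-level one.
  raise ToyNotFound
end

# Plays the part of the framework: turns the exception into a response.
def toy_dispatch(id)
  [200, show(id)]
rescue ToyNotFound => e
  [e.class::STATUS, "Sorry, that page could not be found"]
end

puts toy_dispatch("1").inspect
puts toy_dispatch("2").inspect
```

Note that the id arrives as a string, just as it does in the specs above where get! is expected with "1".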

Testing multipart forms

(TODO: Make an example of uploading assets in the simple blog)

The multipart_post method allows you to include files in a fake request. There must however be an actual file to be opened and submitted. If you put the file in the same directory as your spec, use File.dirname(__FILE__) to ensure the full path is used.

If you are going to open the tempfile which is uploaded, remember to stub out File.open. Watch out though, if you use simply open instead of File.open it won't be the File.open you stubbed out. The other issue here is that within the spec we have no way of knowing what the filename of the tempfile is, so we have to assume it's correct and use an_instance_of(String) so any filename is accepted.

(TODO: test code)

describe Posts, "create action" do
  it "should receive file" do
    File.should_receive(:open).with(an_instance_of(String))
    controller = multipart_post("/posts", {:image => File.open(File.join( File.dirname(__FILE__), "picture.jpg"))})
    controller.assigns(:filename).should == "picture.jpg"
  end
end

Your controller would look something like this.

class Posts < Application
  def create
    fp = File.open(params[:image][:tempfile].path)
    @filename = params[:image][:filename]
  end
end

More ways to dispatch a request

There are several other ways to dispatch a request in your test. Look at Merb's Wiki for more information

Caching

(TODO) - session cache (TODO) - Query/Mem cache

Check Merb's wiki for more information.

Gotchas

Merb

The Rails way | The Merb way
script/server | merb
script/console | merb -i
script/generate | merb-gen
redirect_to blog_path(@blog) | redirect url(:blog, @blog)
respond_to | provides :xml, :js, :yaml
format | content_type
format.html | only_provides :html
render :xml => @post | render @post
render :file => 'public/404.html', :status => 404 | raise NotFound
logger | Merb.logger
before_filter | before
render :partial | partial
f.text_field :name | text_control :first_name
RAILS_ENV | Merb.environment

Freezing Gems

As Merb is split up into various gems, and it's hard to keep up to date with each one, it's a good idea to freeze them into your application so that an update to one gem doesn't break your app.

The easiest way to freeze a gem is to add -i gems as a command line option to specify the location for the installed gem. And then add the gem as a dependency in your init.rb.

gem install aquarium -i gems

When you run this command from the root of your Merb application, it will install the gem inside the gems directory.

If you want to freeze the version of the gem that you have installed which is from trunk, you'll need to find where your gems are located and pass that parameter to the gem install command.

gem environment gemdir

As I have installed Ruby via port my gem folder is located at /opt/local/lib/ruby/gems/1.8. To freeze the aquarium gem I have from trunk I would need to run:

gem install /opt/local/lib/ruby/gems/1.8/cache/aquarium-0.4.1/ -i gems

If you want to freeze merb itself you need to add this to your init.rb, then run the following:

require 'merb-freezer'

rake freeze:core
rake freeze:more
rake freeze:plugins

Once the merb gem is frozen, you can run merb with frozen-merb. If you want to update your frozen gem version, pass the update parameter to the rake task:

rake freeze:core UPDATE=true

DataMapper

(TODO) - DM / AR diffs

RSpec

Submitting a patch

http://www.gweezlebur.com/2008/2/1/so-you-want-to-contribute-to-merb-core-part-1

(TODO) - example patch

Diffs

(TODO) - where to send diffs

Docs

(TODO) - doc convention

Specs

(TODO) - write specs

Hacking Merb

(TODO) - Hacking Merb

Changing the directory structure

# Build the framework paths.
#
# By default, the following paths will be used:
# application:: Merb.root/app/controller/application.rb
# config:: Merb.root/config
# lib:: Merb.root/lib
# log:: Merb.root/log
# view:: Merb.root/app/views
# model:: Merb.root/app/models
# controller:: Merb.root/app/controllers
# helper:: Merb.root/app/helpers
# mailer:: Merb.root/app/mailers
# part:: Merb.root/app/parts
#
# To override the default, set Merb::Config[:framework] in your initialization file.
# Merb::Config[:framework] takes a Hash whose key is the name of the path, and whose
# values can be passed into Merb.push_path (see Merb.push_path for full details).
#
# ==== Note
# All paths will default to Merb.root, so you can get a flat-file structure by doing
# Merb::Config[:framework] = {}
# 
# ==== Example
# {{[
#   Merb::Config[:framework] = {
#     :view => Merb.root / "views",
#     :model => Merb.root / "models",
#     :lib => Merb.root / "lib"
#   }
# ]}}
# 
# That will set up a flat directory structure with the config files and controller files
# under Merb.root, but with models, views, and lib with their own folders off of Merb.root.
class Merb::BootLoader::BuildFramework < Merb::BootLoader
  class << self

    def run
      build_framework
    end

    # This method should be overridden in merb_init.rb before Merb.start to set up a different
    # framework structure
    # DOC: Yehuda Katz FAILED
    def build_framework
      unless Merb::Config[:framework]
        %w[view model controller helper mailer part].each do |component|
          Merb.push_path(component.to_sym, Merb.root_path("app/#{component}s"))
        end
        Merb.push_path(:application,  Merb.root_path("app/controllers/application.rb"))
        Merb.push_path(:config,       Merb.root_path("config"), nil)
        Merb.push_path(:environments, Merb.dir_for(:config) / "environments", nil)
        Merb.push_path(:lib,          Merb.root_path("lib"), nil)
        Merb.push_path(:log,          Merb.log_path, nil)
        Merb.push_path(:public,       Merb.root_path("public"), nil)
        Merb.push_path(:stylesheet,   Merb.dir_for(:public) / "stylesheets", nil)
        Merb.push_path(:javascript,   Merb.dir_for(:public) / "javascripts", nil)
        Merb.push_path(:image,        Merb.dir_for(:public) / "images", nil)        
      else
        Merb::Config[:framework].each do |name, path|
          Merb.push_path(name, Merb.root_path(path.first), path[1])
        end
      end
    end
  end
end

Deploying a Merb Application

"Yes, Rails scales just like everything else scale." - Ezra Zygmuntowicz

The most satisfying experience of building a web application is having others use it. Implementing a robust deployment plan is essential to ensure each release of your project goes off without a hitch.

The Pieces

Subversion

Version control is an essential piece of any software development cycle. There are several options available, including CVS, Git, Subversion, and many more. This guide assumes you have either a Subversion or Git repo that holds your application.

Nginx

A proxy is required to handle incoming HTTP requests. Here there are several options including Apache, Lighttpd, Swiftiply, and many more. A favorite in the community for performance and simplicity has become Nginx, which we'll use for this example. Developed in Russia by Igor Sysoev, the goal of Nginx is to provide a lightweight, high performance web server.

Capistrano

Merb deployment is based on Capistrano. Originally developed to ease the process of pushing Rails applications into production, it has since been generalised to automate tasks via SSH on remote servers. It can be used for software installation, application deployment, configuration management, and much more.

Mongrel

For our merb instances, there are a variety of Ruby web servers. The de facto standard has been Mongrel, which has also spawned improved stacks such as Thin and Swiftiply. It is extremely easy to change, so examples for all 3 will be provided.

Preparing Your Production Server

Nginx

The stable version of Nginx at the time of writing is 0.5.35 with the latest development of 0.6.29. There is also a branch that supports a newer balancer for handing out requests. For this example we'll compile the latest development version.

./configure --prefix=/usr/local --with-http_ssl_module

Create a configuration file. This example has been maintained by Ezra and is available at http://brainspl.at/nginx.conf.txt

# user and group to run as
user  ez ez;

# number of nginx workers
worker_processes  6;

# pid of nginx master process
pid /var/run/nginx.pid;

# Number of worker connections. 1024 is a good default
events {
  worker_connections 1024;
}

# start the http module where we config http access.
http {
  # pull in mime-types. You can break out your config 
  # into as many include's as you want to make it cleaner
  include /etc/nginx/mime.types;

  # set a default type for the rare situation that
  # nothing matches from the mime-type include
  default_type  application/octet-stream;

  # configure log format
  log_format main '$remote_addr - $remote_user [$time_local] '
                  '"$request" $status  $body_bytes_sent "$http_referer" '
                  '"$http_user_agent" "$http_x_forwarded_for"';

  # main access log
  access_log  /var/log/nginx_access.log  main;

  # main error log
  error_log  /var/log/nginx_error.log debug;

  # no sendfile on OSX
  sendfile on;

  # These are good default values.
  tcp_nopush        on;
  tcp_nodelay       off;
  # output compression saves bandwidth 
  gzip            on;
  gzip_http_version 1.0;
  gzip_comp_level 2;
  gzip_proxied any;
  gzip_types      text/plain text/html text/css application/x-javascript text/xml application/xml application/xml+rss text/javascript;


  # this is where you define your mongrel clusters. 
  # you need one of these blocks for each cluster
  # and each one needs its own name to refer to it later.
  upstream mongrel {
    server 127.0.0.1:4000;
    server 127.0.0.1:4001;
    server 127.0.0.1:4002;
  }

  # the server directive is nginx's virtual host directive.
  server {
    # port to listen on. Can also be set to an IP:PORT
    listen 80;

    # Set the max size for file uploads to 50Mb
    client_max_body_size 50M;

    # sets the domain[s] that this vhost server requests for
    # server_name www.[engineyard].com [engineyard].com;

    # doc root
    root /data/ez/current/public;

    # vhost specific access log
    access_log  /var/log/nginx.vhost.access.log  main;

    # this rewrites all the requests to the maintenance.html
    # page if it exists in the doc root. This is for capistrano's
    # disable web task
    if (-f $document_root/system/maintenance.html) {
      rewrite  ^(.*)$  /system/maintenance.html last;
      break;
    }

    location / {
      # needed to forward user's IP address to rails
      proxy_set_header  X-Real-IP  $remote_addr;

      # needed for HTTPS
      proxy_set_header  X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header Host $http_host;
      proxy_redirect false;
      proxy_max_temp_file_size 0;

      # If the file exists as a static file serve it directly without
      # running all the other rewrite tests on it
      if (-f $request_filename) { 
        break; 
      }

      # check for index.html for directory index
      # if it's there on the filesystem then rewrite
      # the url to add /index.html to the end of it
      # and then break to send it to the next config rules.
      if (-f $request_filename/index.html) {
        rewrite (.*) $1/index.html break;
      }

      # this is the meat of the rails page caching config
      # it adds .html to the end of the url and then checks
      # the filesystem for that file. If it exists, then we
      # rewrite the url to have explicit .html on the end
      # and then send it on its way to the next config rule.
      # if there is no file on the fs then it sets all the 
      # necessary headers and proxies to our upstream mongrels
      if (-f $request_filename.html) {
        rewrite (.*) $1.html break;
      }

      if (!-f $request_filename) {
        proxy_pass http://mongrel;
        break;
      }
    }

    error_page   500 502 503 504  /500.html;
    location = /500.html {
      root   /data/ez/current/public;
    }
  }

  # This server is setup for ssl. Uncomment if 
  # you are using ssl as well as port 80.
  server {
    # port to listen on. Can also be set to an IP:PORT
    listen 443;

    # Set the max size for file uploads to 50Mb
    client_max_body_size 50M;

    # sets the domain[s] that this vhost server requests for
    # server_name www.[engineyard].com [engineyard].com;

    # doc root
    root /data/ez/current/public;

    # vhost specific access log
    access_log  /var/log/nginx.vhost.access.log  main;

    # this rewrites all the requests to the maintenance.html
    # page if it exists in the doc root. This is for capistrano's
    # disable web task
    if (-f $document_root/system/maintenance.html) {
      rewrite  ^(.*)$  /system/maintenance.html last;
      break;
    }

    location / {
      # needed to forward user's IP address to rails
      proxy_set_header  X-Real-IP  $remote_addr;

      # needed for HTTPS
      proxy_set_header X_FORWARDED_PROTO https;

      proxy_set_header  X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header Host $http_host;
      proxy_redirect false;
      proxy_max_temp_file_size 0;

      # If the file exists as a static file serve it directly without
      # running all the other rewrite tests on it
      if (-f $request_filename) { 
        break; 
      }

      # check for index.html for directory index
      # if it's there on the filesystem then rewrite
      # the url to add /index.html to the end of it
      # and then break to send it to the next config rules.
      if (-f $request_filename/index.html) {
        rewrite (.*) $1/index.html break;
      }

      # this is the meat of the rails page caching config
      # it adds .html to the end of the url and then checks
      # the filesystem for that file. If it exists, then we
      # rewrite the url to have explicit .html on the end
      # and then send it on its way to the next config rule.
      # if there is no file on the fs then it sets all the 
      # necessary headers and proxies to our upstream mongrels
      if (-f $request_filename.html) {
        rewrite (.*) $1.html break;
      }

      if (!-f $request_filename) {
        proxy_pass http://mongrel;
        break;
      }
    }
    error_page   500 502 503 504  /500.html;
    location = /500.html {
      root   /data/ez/current/public;
    }
  }
}
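The order of the checks inside location / is the part of this config most worth understanding, so here it is again as a plain-Ruby sketch. resolve_request is a hypothetical helper and existing_files stands in for the filesystem under the doc root; the maintenance page check is left out. It models the decision order only, not nginx's actual rewrite engine.

```ruby
# Ruby sketch of the decision order inside "location /" above.
# resolve_request is hypothetical; existing_files stands in for the filesystem.
def resolve_request(path, existing_files)
  # 1. A static file with exactly this name is served directly.
  return [:static, path] if existing_files.include?(path)

  # 2. A directory index, if one exists.
  index = path.sub(%r{/\z}, "") + "/index.html"
  return [:static, index] if existing_files.include?(index)

  # 3. A page-cached copy with .html added to the URL.
  cached = path + ".html"
  return [:static, cached] if existing_files.include?(cached)

  # 4. Otherwise hand the request to the upstream cluster.
  [:proxy, "http://mongrel"]
end

files = ["/index.html", "/posts/1.html", "/images/logo.png"]
puts resolve_request("/images/logo.png", files).inspect
puts resolve_request("/", files).inspect
puts resolve_request("/posts/1", files).inspect
puts resolve_request("/posts/2", files).inspect
```

Only the last request ever reaches a Merb process; the first three are answered by nginx alone.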

Building a Deployment Recipe with Capistrano

Install Capistrano.

gem install capistrano

Navigate to your Merb repository directory and run the capify command to create the skeleton for your deployment recipe.

$ capify .
[add] writing `./Capfile'
[add] writing `./config/deploy.rb'
[done] capified!

Tailor your deploy.rb to meet the requirements of your application

set :application, "YOUR_APPLICATION_NAME"

# Set the path to your version control system (Subversion assumed)
set :repository, "http://something.com/svn/yourapplication/trunk"

# Set your SVN and SSH User
set :user, "your_ssh_user"
set :svn_user, "your_svn_user"

# Set the full path to your application on the server
set :deploy_to, "/PATH/TO/YOUR/#{application}"

# Define your servers
role :app, "your.appserver.com"
role :web, "your.webserver.com"
role :db, "your.databaseserver.com", :primary => true

desc "Link in the production extras and Migrate the Database ;)"
task :after_update_code do
  run "ln -nfs #{shared_path}/config/database.yml #{release_path}/config/database.yml"
  run "ln -nfs #{shared_path}/config/merb.yml #{release_path}/config/merb.yml"
  run "ln -nfs #{shared_path}/log #{release_path}/log"
  #if you use ActiveRecord, migrate the DB
  #deploy.migrate
end

desc "Merb it up with"
deploy.task :restart do
  run "cd #{current_path};./script/stop_merb"
  run "cd #{current_path};env EVENT=1 merb -c 3"
# If you want to run standard mongrel use this:
# run "cd #{current_path};merb -c 4"
end

#Overwrite the default deploy.migrate as it calls: 
#rake RAILS_ENV=production db:migrate
#desc "MIGRATE THE DB! ActiveRecord with Merb Love"
#deploy.task :migrate do
#  run "cd #{release_path}; rake db:migrate MERB_ENV=production"
#end

Use Capistrano to initiate the environment, setting up the necessary directories on the server.

$ cap deploy:setup

Next, install the gems you need on the production server.

# Ensure you have the latest version of the gem system
sudo gem update --system

sudo gem install merb
sudo gem install rspec

# For ActiveRecord
sudo gem install merb_activerecord

# For DataMapper
sudo gem install datamapper
sudo gem install do_mysql

# For Evented Mongrel
sudo gem install swiftiply

# For Standard Mongrel
sudo gem install mongrel

# For Thin
sudo gem install thin

Create the directories and files that will be linked in.

mkdir /YOURDEPLOYPATH/shared/config
touch /YOURDEPLOYPATH/shared/config/database.yml
touch /YOURDEPLOYPATH/shared/config/merb.yml

Edit the .yml files to your liking and then be sure to create your database in MySQL.

Deploy your app:

cap deploy

You should now have your application deployed to your server, with 3 Mongrel instances running behind Nginx, which proxies requests to them.

Advanced Recipes

Introduction

So you've come this far and are feeling pretty confident about your Merb abilities. This chapter is all about sharing some top tips and code examples that you may find to be time savers in real life situations.

Contributing

Most of the examples here are taken from real world projects or blog posts. We are always looking for contributions so if you have something to share let us know.

Loading The Merb Environment Outside Of Merb

Sometimes you will need to load up a Merb environment (and its frozen gems) for things such as RSpec stories or cron tasks. This little snippet does just that.

env = ENV['MERB_ENV'] || 'test'

require 'rubygems'
Gem.clear_paths
Gem.path.unshift(File.join(File.dirname(__FILE__), "gems"))

require 'merb-core'
Merb.load_dependencies(:environment => env)

require 'spec'
Merb.start_environment(:testing => true, :adapter => 'runner', :environment => env)

Here we have assumed that the script is running from the root of your app. In practice though you will most likely want it in a different location so be sure to adjust the Gem.path accordingly.
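
As a minimal sketch of that adjustment, the helper below works out the gems directory for a script that lives one or more directories below the application root. The name frozen_gems_dir and the example script location are assumptions made for illustration. They are not part of Merb or RubyGems.

```ruby
# Hypothetical helper: find the application's gems directory when the
# script lives `levels_up` directories below the application root.
def frozen_gems_dir(script_file, levels_up = 1)
  parents  = Array.new(levels_up, '..')
  app_root = File.expand_path(File.join(File.dirname(script_file), *parents))
  File.join(app_root, 'gems')
end

# For a script stored one level down (for example in a script directory),
# the Gem.path line from the snippet above could then become:
#   Gem.path.unshift(frozen_gems_dir(__FILE__))
```

Passing the number of levels explicitly keeps the path arithmetic in one place if the script is moved later.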

Freezing Gems

This recipe was contributed by Michael Klishin

One thing the Merb community gets right is gem bundling. The config/init.rb file in Merb apps has the following magic line, which shows that the idea of keeping the application independent of the environment it runs in is baked into the core:

Gem.path.unshift(Merb.root / "gems")

Yay, no need to reinvent a Gem plugin. It is that simple. A note here: to actually get up and running with Merb and Edge ActiveSupport and ActiveRecord bundled under the /gems directory, you have to specify the installation directory with the -i option when installing each gem (YOUR_GEM_NAME below is a placeholder):

sudo gem install YOUR_GEM_NAME -i ~/dev/workspace/some-merb-application/gems

This sets up a directory structure RubyGems' custom require expects to see.
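
To check that a bundled directory has that structure, a small script can look for the subdirectories RubyGems reads from. This is a rough sketch: it only checks for the gems and specifications subdirectories, and the names missing_gem_repo_dirs and REQUIRED_SUBDIRS are made up for this example.

```ruby
# Subdirectories RubyGems reads when it resolves a require from a gem
# repository. Other subdirectories may exist but are not checked here.
# NOTE: both names below are hypothetical, chosen for this example.
REQUIRED_SUBDIRS = %w[gems specifications].freeze

# Returns the names of the expected subdirectories that are missing.
# An empty array means the directory looks like a gem repository.
def missing_gem_repo_dirs(repo_path)
  REQUIRED_SUBDIRS.reject do |name|
    File.directory?(File.join(repo_path, name))
  end
end
```

Running it against the application's gems directory after installing with -i should return an empty array.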

Another thing the Merb community gets right is freezing Merb itself. I want to run this app on Edge Merb, and I want to be independent of whatever Merb gems are installed on the box I deploy to. Since it moved to Git, Rails freezes by exporting a tarball, so you cannot track which tree you are currently using.

In Merb there is a nice plugin, merb-freezer. It freezes Merb using either a gems unpack strategy or, if you use Git for your Merb application, a Git submodules strategy. This is very cool. Git submodules are like Subversion's externals, but adapted to the distributed nature of Git and packed with features Subversion lacks.

With Git submodule freezing you can track which commit hash the app is frozen to, read the recent log messages, update submodules one by one or all at once, use the branch you want from a repository as a submodule, and see a meaningful summary of submodule state. Compare this to managing tarballs.

To use merb-freezer, all you have to do is install it from the merb-more repo and include the line

require 'merb-freezer'

into your config/init.rb. Then run

rake freeze:core

if you want to use Git submodules or

MODE=gems rake freeze:core

if you want to go with installed gems.

freeze:more and freeze:plugins do freezes of merb-more and merb-plugins, respectively.

If you choose submodules, make sure you start with a clean branch. The submodules meta information file (.gitmodules) and the frameworks directory that Merb is frozen into have to be committed after running the Rake task that does the freeze.

To update Merb, use the same Rake task with the UPDATE environment variable set to true. To see which commits the application is frozen to, use

git submodule status

To see N recent commits in Merb core installed as a submodule use

git submodule summary -n <N> frameworks/merb-core
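
If you want to use the frozen commits in a script, for example to log them during a deploy, the output of git submodule status can be parsed. Here is a minimal sketch, assuming the usual one-line-per-submodule format: a one-character state prefix, the commit hash, the path and an optional description in parentheses. The method name parse_submodule_status and the state symbols are made up for this example, and paths containing spaces are not handled.

```ruby
# Meaning of the one-character prefix on each line of the output.
# NOTE: the symbol names are hypothetical labels chosen for this example.
SUBMODULE_STATES = {
  ' ' => :in_sync,          # checked-out commit matches the recorded one
  '-' => :not_initialized,  # submodule has not been initialized
  '+' => :different_commit, # checked-out commit differs from the recorded one
  'U' => :merge_conflict    # submodule has merge conflicts
}.freeze

STATUS_LINE = /\A([ \-+U])([0-9a-f]+) (\S+)(?: \((.*)\))?\z/

# Turns the text printed by `git submodule status` into an array of hashes.
# Lines that do not match the expected format are skipped.
def parse_submodule_status(output)
  output.each_line.map do |line|
    match = line.chomp.match(STATUS_LINE)
    next unless match

    { :state    => SUBMODULE_STATES[match[1]],
      :commit   => match[2],
      :path     => match[3],
      :describe => match[4] }
  end.compact
end

# Usage, run from the root of the application:
#   parse_submodule_status(`git submodule status`)
```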

Using X-Accel-Redirect and nginx_send_file

In your usual merb application, most of your assets (images, stylesheets etc.) can cheerfully sit in the public folder of the application, ready to be served up quickly and efficiently by nginx (or whatever webserver you're using) without having to trouble mongrel or thin or whatever is running the merb part of the application. There's no need for merb to see the request, because they're public files, ready to be given to everybody.

Right?

Most of the time, that's fine, but sometimes you want to protect some premium content. Perhaps premium users get more themes, or perhaps you have to register before you can see some images or download PDFs. Requests like those are going to have to hit merb, so that you can check the user has sufficient privileges to make them. The checks might be as simple as 'is a user' or as complex as you can imagine them to be.

The Simple Way

Just use send_file. I'm assuming in this example there are some pdf reports, which only appropriately authenticated users can access, which is what the before filter does. All the important stuff happens in the download_report action, so I'm ignoring the rest of the controller. I'm also assuming the appropriate MIME type for pdfs has been set up.

class Reports < Application

  before :check_authentication, :only => :download_report

  def download_report
    only_provides :pdf

    @report = Report.first(:name => params[:name])
    send_file(@report.file_path, :type => 'application/pdf')
  end
end

Assuming that file_path returns the full path to the report file, perhaps located in the private/reports subdirectory of the merb application (NOT in the public folder), the file will be sent to the client with the appropriate headers and they'll download it. Since you're probably using mongrel or thin, this won't actually tie the whole application up, because it will be run in a separate thread or EventMachine connection. Everything's good, right?

Well, not entirely. First, there's a small delay whilst the file is read into memory. Second, there's the fact that the file is read into memory and has to remain there whilst it's sent. Not so much of an issue for a 50 KB image, but when you start to get a couple of dozen users wanting to download a 10 or 20 MB pdf report, all that memory usage starts to become quite noticeable.

X-Accel-Redirect

This is where nginx comes in. Nginx offers a facility which it calls X-Accel-Redirect, which works quite simply. When appropriate headers are set by the application, nginx will read files outside the normal web root, and send them to the user.

There are two parts to this. The first is a section in the nginx config file.

location /private/ {
    internal; # only the server can make requests here, a client will get a 404
    alias /path/to/merb/root/private/; # note trailing slash
}

This snippet goes inside the nginx server block. internal is the directive which gives us our protection, since a direct request to the url will just give a 404 error, not serve the file.

For the second part, we re-write our action from earlier:

def download_report
  only_provides :pdf

  @report = Report.first(:name => params[:name])
  headers['Content-Type'] = ''
  nginx_send_file(@report.file_path)
end

We have to (at the time of writing) set the content type to blank, to let nginx set it appropriately, although we could also set it accurately ourselves. All that's left is to make sure the file path is correct. This path should begin with /private/ and should be the path of the file in the private directory, something like /private/reports/secret_report.pdf. The nginx_send_file method takes care of setting the X-Accel-Redirect header appropriately.

Assuming the file exists, nginx will send it to the client, without merb needing to load it into memory and with nginx's trademark speed and efficiency.
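
If your model stores the location of the file on disk rather than the internal URL, the path has to be translated before it is handed to nginx. The sketch below shows one way to do that, along with a rough picture of the response such a redirect needs. Both method names are hypothetical, and accel_redirect_response is an illustration using a plain Hash for the headers, not Merb's actual nginx_send_file implementation.

```ruby
# Hypothetical helper: map a file on disk to the internal URL served by
# the nginx `location /private/` block. Raises if the file is outside
# the private directory, so a crafted name cannot escape it.
def internal_redirect_path(file_on_disk, private_root, location = '/private/')
  root = File.expand_path(private_root)
  file = File.expand_path(file_on_disk)

  unless file.start_with?(root + File::SEPARATOR)
    raise ArgumentError, "#{file_on_disk} is outside #{private_root}"
  end

  # Drop the root and its trailing separator, keep the rest of the path.
  location + file[(root.length + 1)..-1]
end

# Rough picture of the response: a header that points nginx at the file,
# plus a blank body (assumed here), since nginx supplies the real content.
def accel_redirect_response(headers, internal_path)
  headers['X-Accel-Redirect'] = internal_path
  ' '
end
```

In the action above, the result of internal_redirect_path would be the value passed to nginx_send_file.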

Plugins

(TODO) - Go over some cool merb plugins