Posted by & filed under Databases, NoSQL.

Does anyone still remember that books once came on paper? They were heavy, rarely under 500 pages, and had intimidating titles that usually included the words “professional”, “complete guide”, or even “bible”. And they made you walk through the topic from the beginning of time, starting with Adam & Eve, not leaving out the abacus and Babbage’s difference engine. I hardly remember how many introductions to object-oriented computing I had to skip in my life…

Luckily, these days are gone for good: the book that comes on paper is almost dead, at least when it comes to books on computing. The ebook opened up the market for a whole new breed of publications: short and very focused papers on very specific topics. These papers rarely weigh in at more than 200 pages, deal with one topic and one topic only, and can be safely digested in about an hour of your precious time. Learning Mongoid is a very good example of this kind of “book”.

Mongoid it is

The new publication from Packt Publishing, “Learning Mongoid” by Gautam Rege, does exactly what it says: it shows you how to use Mongoid. When you pick up this short paper, you should already have your mind set on:

  • Ruby & Rails are good for you
  • MongoDB is good for you

The book does not help you with these decisions, which is a good thing (see above), but it does help you when it comes to dealing with the one-and-only mapping framework for MongoDB that is still standing.

Although Mongoid’s documentation has vastly improved in the last year or so (which is probably one of the main reasons barely anyone is using MongoMapper anymore), scraping together decent examples for using it still takes some digging. So the book fills the (albeit small) gap between Mongoid’s documentation and the myriad of tiny snippets you can dig up on Stack Overflow. It is a small book (140 pages), but it covers just about every piece of Mongoid’s API, along with helpful examples that show you how to use it.

The writing is clear and to the point, and boxes point out common pitfalls and things you should look out for when using Mongoid. Unlike other books of this nature, the examples do not build up to a complete application: instead, they focus on demonstrating one specific part of the API and do a good job of showing you how to apply each concept to a real-world(-ish) example.

The bottom line

I liked that the book is short: it touches on every part of the API, giving you a good overview of Mongoid. This is often very helpful: once you know that things like the Paranoia plugin exist and what their purpose is, you can still find the specifics online when it comes to actually using them, now that you know what you are looking for.

That being said, the shortness comes at a price:

  • Some topics come up a little too short: there is probably more to scaling MongoDB than just getting the formatting of your data disk right (although it’s a start). But then again, this is not a book about MongoDB. Still, if you touch on a topic this lightly, you might just as well leave it out altogether…
  • The book shows every piece of the API, but it doesn’t help you much with choosing which pattern to use when. Mapping your domain onto a Mongoid/MongoDB model is not entirely straightforward, and there are many ways to model the same domain. This can be a bit challenging at first, especially when you come from an RDBMS background. But again: this book is not about modeling MongoDB databases, it is about Mongoid and Mongoid only.
  • Some examples are very terse, and you can probably argue that you can find these kinds of examples on Stack Overflow and combine them with the Mongoid docs (which are not so bad to begin with). But having all this stuff together in one place is certainly helpful when you start.

All in all, I liked the book. Even if you’ve been using Mongoid for some time, you will probably find one thing or another in it that you didn’t know yet. And if you’re just starting with Mongoid and are looking for a decent introduction to the whole API, this is certainly the book to go for.

Posted by & filed under Rails, Testing.

Part I shows an implementation that sends out huge numbers of notification mails asynchronously using sidekiq. This solves the problem of clogging requests with a huge number of mail-sends. But once you have made everything asynchronous, how do you test it?

Here is the problem: Work is sent to redis via sidekiq, the (external) sidekiq-process picks it up and does something with it outside of your usual request-response-cycle – how do you test that?

The answer is quite simple: you don’t. That is, you don’t run things asynchronously. The gem rspec-sidekiq does just that: instead of actually sending the jobs to sidekiq for processing, it enqueues these jobs in an internal job array that you can inspect from within your tests. It even gives you handy expectations to write your specs:

expect(NotificationsWorker).to have_enqueued_jobs(1) # new expect syntax
# ...but you could just use
expect(NotificationsWorker).to have(1).jobs
# ...or even
expect(NotificationsWorker).to have(1).enqueued.jobs

Check out the documentation for rspec-sidekiq for the whole awesomeness.

So, how do we test our asynchronous mailer then? The first thing we have to test is that the job that kicks off all the mail-sending jobs is created:

 
it "should kick off a Notification-Job" do
   observed_model   = FactoryGirl.create(:interesting_model, state: "inactive")
   interested_user  = FactoryGirl.create(:user, receive_notifications: true)
 
   observed_model.activate!
 
   expect(NotificationsWorker).to have_enqueued_jobs(1) 
end

Up to this point, no work has been done: the notification has just been enqueued, but not executed. The next step is to actually do something, and this is where .drain comes into play: it inspects the queue and executes everything it finds in there. The notifications worker itself does no sending at all. It only enqueues mail jobs; the actual sending will again be done asynchronously. So you call NotificationsWorker.drain on this queue and expect it to generate the mail jobs. All that’s left to do is to find out the name of the mail worker, Sidekiq::Extensions::DelayedMailer, which you can then inspect:

it "should kick off a Notification-Job and enqueue a mail job" do
   observed_model   = FactoryGirl.create(:interesting_model, state: "inactive")
   interested_user  = FactoryGirl.create(:user, receive_notifications: true)
 
   observed_model.activate!
 
   expect(NotificationsWorker).to have_enqueued_jobs(1) 
 
   expect{
        # forces the execution of all enqueued jobs in this queue
        NotificationsWorker.drain
      }.to change{Sidekiq::Extensions::DelayedMailer.jobs.size}.by 1
end

Still, no mails have been sent out at this point: you need to drain the mail-worker queue, which does the actual sending. With that, you finally end up at the usual means of testing emails:

it "should notify interested users of state-changes" do 
   observed_model   = FactoryGirl.create(:interesting_model, state: "inactive")
   interested_user  = FactoryGirl.create(:user, receive_notifications: true)
 
   observed_model.activate!
 
   expect(NotificationsWorker).to have_enqueued_jobs(1) 
 
   expect{
        # forces the execution of all enqueued jobs in this queue
        NotificationsWorker.drain
   }.to change{Sidekiq::Extensions::DelayedMailer.jobs.size}.by 1
 
   expect{
        # one mail job was enqueued above, so draining it should deliver one mail
        Sidekiq::Extensions::DelayedMailer.drain
   }.to change{ActionMailer::Base.deliveries.count}.by 1
end
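To make the two drain steps a bit less magical, here is a rough plain-Ruby model of what a test-mode job array does. It is only a sketch and not rspec-sidekiq's or sidekiq's actual implementation: FakeAsync, NotifyJob, MailJob and DELIVERIES are hypothetical names made up for this illustration.

```ruby
# Hypothetical mixin that mimics a test-mode job array: perform_async only
# records the arguments, drain executes everything that was recorded.
module FakeAsync
  def jobs
    @jobs ||= []
  end

  def perform_async(*args)
    jobs << args
  end

  def drain
    new.perform(*jobs.shift) until jobs.empty?
  end
end

# Stands in for ActionMailer::Base.deliveries in this sketch.
DELIVERIES = []

# Second stage: "sends" a single mail by recording it.
class MailJob
  extend FakeAsync

  def perform(event_id, email)
    DELIVERIES << "event #{event_id} -> #{email}"
  end
end

# First stage: does no sending itself, it only enqueues mail jobs.
class NotifyJob
  extend FakeAsync

  def perform(event_id, emails)
    emails.each { |email| MailJob.perform_async(event_id, email) }
  end
end

NotifyJob.perform_async(42, ["a@example.com", "b@example.com"])
puts NotifyJob.jobs.size # 1 (enqueued, nothing executed yet)
puts MailJob.jobs.size   # 0

NotifyJob.drain
puts MailJob.jobs.size   # 2 (mail jobs enqueued, nothing delivered yet)
puts DELIVERIES.size     # 0

MailJob.drain
puts DELIVERIES.size     # 2
```

Each stage only becomes visible after its own queue has been drained, which is why the spec above needs one expectation per stage.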

Asynchronous scenarios tend to be a little more complex than simply running your code inline. Fortunately, rspec-sidekiq, which basically allows you to flatten all this asynchronicity, takes away much of the pain that comes with testing these scenarios. The bulk-notification scenario above should give you a good head start.

Posted by & filed under Rails.

Even if you are not a spammer, there comes a time when you need to send out a great many mails: something has changed in your system and you have a great many subscribers that (really!) want to be notified about these changes. Don’t do this yourself, let sidekiq do it. In this post we’ll show you the implementation; the follow-up post will show you how to properly test these more complex, asynchronous scenarios.

Here is the (rather common) setup: we have a model with a state machine and we have users that are interested in its state changes. Whenever the state of our model changes, we want to send out emails to them. If you only have a few interested users, this is quite simple:

after_transition do |event, transition|
   User.interested_in(model).each do |user|
      NotificationMailer.state_changed_email(user, model).deliver
   end
end

where interested_in is a scope on the User model. With only a few interested users, you can probably get away with that. The problem starts when you have significantly more than a few, as this will block your request: sending out emails can take a while, and if you have lots of sending to do, you will probably even hit a timeout.

The first fix for this we encountered on stackoverflow was the idea to use multi-threading:

threads = []

User.interested_in(model).each do |user|
   # one thread per interested user
   threads << Thread.new do
      NotificationMailer.state_changed_email(user, model).deliver
   end
end
# the request still blocks here until every thread has finished
threads.each(&:join)

But if you actually have 500 interested users, do you really want to open 500 threads? What about 5,000? 500,000? And even if you open up these threads, you still need to do the actual sending inside your request and wait until all the threads have come back.

So the next idea here is to get the actual sending out of the request: here is where sidekiq comes into play. Sidekiq already comes with an extension that allows you to delay the actual sending and delegate it to sidekiq:

after_transition do |event, transition|
  User.where(notify: true).each do |user|
      NotifyMailer.delay.state_changed_email(event.id,user.email)
  end
end

This farms out the actual sending to sidekiq. Under the hood, sidekiq basically does the same as the threading code above (it uses threads to parallelize the work), but with a significant difference: it has a (configurable) limit on the number of worker threads. Even if you send out 500,000 emails, there will only be 25 threads to do that, which gives you a much better balance.
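The idea of a bounded number of worker threads can be sketched in plain Ruby. This is not how sidekiq is implemented internally, just a minimal illustration of the principle; send_with_pool and deliver_mail are hypothetical helpers, and the "delivery" only records the address instead of sending anything.

```ruby
# Hypothetical stand-in for the real mail delivery: it only records the address.
def deliver_mail(address, sent)
  sent << address
end

# Works through all addresses with a fixed number of worker threads,
# no matter how many addresses there are.
def send_with_pool(addresses, pool_size: 25)
  queue = Queue.new
  addresses.each { |address| queue << address }
  queue.close # pop returns nil once the closed queue is empty

  sent = Queue.new # thread-safe collector for the "delivered" addresses
  workers = Array.new(pool_size) do
    Thread.new do
      while (address = queue.pop)
        deliver_mail(address, sent)
      end
    end
  end
  workers.each(&:join)

  Array.new(sent.size) { sent.pop }
end

addresses = (1..500).map { |i| "user#{i}@example.com" }
delivered = send_with_pool(addresses, pool_size: 25)
puts delivered.size # 500
```

Whether there are 500 or 500,000 addresses, the number of threads stays at pool_size; only the time it takes to work through the queue grows.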

There is only one thing left: the actual mail-sending is out of the way, but if you’re sending out many notifications, even kiqing off the workers (which is basically a push onto a Redis list per job) can take some time. So let’s use a worker to kiq off the other workers:

class EventNotifier
  include Sidekiq::Worker
 
  def perform(event_id)
    User.where(notify: true).each do |user|
       NotifyMailer.delay.state_changed_email(event_id,user.email)
    end
  end
end

And instead of starting the jobs directly, kiq off this worker, which in turn will kiq off the mail workers:

after_transition do |event, transition|
  EventNotifier.perform_async(event.id)
end

This call comes back immediately and does not depend on the number of notifications you want to send out. Most importantly, it takes the actual sending out of the request and prevents timeouts.
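The difference in request-side work can be illustrated with a small plain-Ruby model. It is only a sketch under simplifying assumptions: JOB_LIST is a plain Array standing in for the job store, and enqueue_directly, enqueue_notifier and pushes_during are hypothetical helpers, not part of sidekiq.

```ruby
# Plain Array standing in for the job store that enqueued jobs are pushed to.
JOB_LIST = []

# Direct approach: the request itself pushes one mail job per subscriber.
def enqueue_directly(event_id, emails)
  emails.each { |email| JOB_LIST << [:mail, event_id, email] }
end

# Fan-out approach: the request pushes a single notifier job and returns.
def enqueue_notifier(event_id)
  JOB_LIST << [:notify, event_id]
end

# Counts how many pushes happen while the given block runs.
def pushes_during
  before = JOB_LIST.size
  yield
  JOB_LIST.size - before
end

emails = (1..5000).map { |i| "user#{i}@example.com" }
puts pushes_during { enqueue_directly(7, emails) } # 5000
puts pushes_during { enqueue_notifier(7) }         # 1
```

The per-subscriber pushes still happen in the fan-out variant, but inside the notifier job on the worker side rather than inside the request.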

There are of course other options for this: Mailchimp or SendGrid offer APIs for doing these kinds of bulk notifications. But this is a more general approach that you can use to get stuff out of your request cycle. In part II of this series, we’ll make sure that this approach is properly tested.