Optimizing Sidekiq For Maximum CPU Performance on Multicore Systems

I have been working a lot with somewhat large datasets (millions of records) that benefit from parallel processing. I thought Sidekiq’s multi-threading was going to be a great solution for this, but upon further investigation I noticed my work was only marginally faster and my CPU was never at 100%. In fact, it was hovering around 25%… what gives? Maybe my jobs were IO bound? Nope, that wasn’t the case: $ top showed CPU wait time at 0.0, so the CPU wasn’t waiting on IO. What could be the issue?

Global Interpreter Lock (GIL) Sadness

On further research, I learned that MRI Ruby threads run one at a time, even on a multi-core system! This is to protect non-thread-safe code. The JRuby and Rubinius implementations have threads that can run in parallel, but I didn’t have a chance to try them. Reading this Toptal article was very informative for understanding the difference between concurrency and parallelism in Ruby. (Sorry for using “parallel” incorrectly in previous blogs!)
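
If you want to see the GIL for yourself, here is a minimal sketch (my own illustration, not code or numbers from this post): it runs the same CPU-bound loop in four MRI threads and then in four forked processes. Watch top while it runs – on MRI the threaded half stays near one core, while the forked half spreads across all of them.

require 'benchmark'

# A tight CPU-bound loop with no IO, so the GIL is the only thing in the way.
def burn_cpu
  2_000_000.times { |i| Math.sqrt(i) }
end

# Four MRI threads: they take turns on one core because of the GIL.
puts Benchmark.measure { 4.times.map { Thread.new { burn_cpu } }.each(&:join) }

# Four forked processes: each gets its own interpreter and can use its own core.
puts Benchmark.measure { 4.times { fork { burn_cpu } }; Process.waitall }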

Solution For Maxing Out CPU in Sidekiq

So the only way to max out CPU utilization with Sidekiq is to use more processes. All you have to do is spawn more Sidekiq processes with the same configuration file and they will be added to the pool of workers. Neat and simple! Note that more worker processes mean more memory: worker threads can share memory, but worker processes will not, and if you spawn too many processes you’ll run out of memory quickly. In general, it is best to spawn only as many worker processes as you have logical CPUs.

I wrote a quick script to manage starting/stopping sidekiq workers. Feel free to use it too if you’d like:

#!/bin/bash
 
NUM_WORKERS=2
NUM_PROCESSES=4
 
# http://www.ostricher.com/2014/10/the-right-way-to-get-the-directory-of-a-bash-script/
get_script_dir () {
     SOURCE="${BASH_SOURCE[0]}"
     # While $SOURCE is a symlink, resolve it
     while [ -h "$SOURCE" ]; do
          DIR="$( cd -P "$( dirname "$SOURCE" )" && pwd )"
          SOURCE="$( readlink "$SOURCE" )"
          # If $SOURCE was a relative symlink (no "/" prefix), resolve it relative to the symlink's base directory
          [[ $SOURCE != /* ]] && SOURCE="$DIR/$SOURCE"
     done
     DIR="$( cd -P "$( dirname "$SOURCE" )" && pwd )"
     echo "$DIR"
}
 
start_sidekiq_workers() {
  echo "Starting $NUM_PROCESSES sidekiq procesess with $NUM_WORKERS each."
  for n in `eval echo {1..$NUM_PROCESSES}`; do
    bundle exec sidekiq -r "$(get_script_dir)/../config/environment.rb" -c $NUM_WORKERS &
  done
}
 
case $1 in
  stop)
  ps aux|grep "sidekiq 3"|grep -v grep|awk '{print $2}'|xargs kill
  ;;
  start)
  start_sidekiq_workers
  ;;
  status)
  ps aux|grep "sidekiq 3"|grep -v grep
  ;;
  *)
  start_sidekiq_workers
  ;;
esac

Results

So after playing around with worker threads and processes, here are the results of the job I was working on with different parameters:

Completed importing all files in 01:39:48:119774276. – 25 workers, 1 process
Completed importing all files in 00:44:06:214396540. – 10 workers, 2 processes
Completed importing all files in 00:28:21:940166878. – 5 workers, 4 processes
Completed importing all files in 00:17:51:737359697. – 4 workers, 4 processes
Completed importing all files in 00:11:04:804641568. – 2 workers, 8 processes
Completed importing all files in 00:09:59:336971420. – 1 worker, 16 processes

Clearly, using more processes is faster than just adding more worker threads. Just make sure you have enough memory! For my test run, I could only divide the work into 16 jobs, so I couldn’t test with more processes… but at a certain point, adding more processes would stop making the job faster and would probably start slowing the system down with overhead. I recommend running benchmarks on a small subset of your data to find the right balance before processing the whole thing! You can save a lot of time if you can turn a 20-hour job into a 2-hour job.

I would love to see how multi-threaded processing works out with Rubinius. It’s been pretty fun learning about concurrency and parallel computing in a Ruby context.

MySQL – Processing 8.5 Million Rows In a Reasonable Amount of Time

I had to crunch through a database of approximately 8.5 million records computing a hash, validating a few fields, and then updating a column with the results of the hash. Because I needed to compute the hash, I couldn’t just use an UPDATE statement to work on all the rows – I had to read and update each of the 8.5 million rows in my script. Sounds painful already!

On my initial attempt at SELECT-ing each record, calculating the hash, and UPDATE-ing, I was able to do about 150 rows per second… Let’s see, that’ll take about 20 hours… We can do better than that… :)

Skip to TL;DR

Parallelization & Optimization

The first thing I wanted to do was use all 8 logical cores (Core i7 with hyper-threading), which is why I chose Sidekiq. We could now have multiple workers crunching different chunks of data and hopefully saturate both IO and CPU. I used Boson to make my app a simple command-line application, but I could just as easily have used rake tasks.

Here is pass #1 at this task:

Command Line To Start Worker:

#!/usr/bin/env ruby
 
require File.expand_path( '../../config/environment', __FILE__)
 
require 'boson/runner'
require 'sidekiq'
require 'workers/process_worker'
 
class ProcessRunner < Boson::Runner
  def process( num_workers = 10 )
    num_workers=num_workers.to_i
    puts 'spawning workers'
 
    Sidekiq.redis {|conn| conn.set('timer', Time.now.to_f) }
    num_workers.times do | worker_number |
      ProcessWorker.perform_async worker_number, 100
    end
  end
end
 
ProcessRunner.start

Worker:

require 'sidekiq'
require 'sequel'
 
class ProcessWorker
  include Sidekiq::Worker
 
  def perform(worker_number, chunk_size)
    DB.transaction do
      dataset = DB[:work_table].select_all(:work_table).left_outer_join(:output_table, :fk_id => :id).where(:result => nil).limit(chunk_size,worker_number*chunk_size)
      output_table = DB[:output_table]
      dataset.each do |row|
        result = # Process the row data here
 
        output_table.insert(:result => result, :fk_id => row[:id])
      end
 
      if dataset.count > 0
        puts "spawn worker to do more work again."
        ProcessWorker.perform_async worker_number, chunk_size
      else
        start_time=Time.at(Sidekiq.redis {|conn| conn.get('timer') }.to_f)
        end_time = Time.now
 
        puts "Worker ##{worker_number} reports job took #{(end_time - start_time)*1000} milliseconds"
      end
    end
  end
end

My strategy here was to have a secondary table (output_table) that I would dump my results into and join to the primary table work_table. With the left outer join, I would search for rows in work_table that had NULL in the result field, indicating they had not been processed yet. I tried Sequel for this project because I had never used it before and thought it would be nice to have a Ruby-model way of accessing the data. It turned out sorta nice. The data was divided into chunks among the workers, each chunk_size rows large (100 in the example)… so it would look something like this:

--------------------------------------
| row 1-99    | chunk 0 for worker 0 |
| row 100-199 | chunk 1 for worker 1 |
| row 200-299 | chunk 2 for worker 2 |
| row 300-399 | chunk 3 for worker 3 |
| row 400-499 | chunk 4 for worker 4 |
| ...         | ..                   |
--------------------------------------

This algorithm and implementation did not turn out so nice though! Because I was SELECT-ing data and processing it non-atomically, if worker 1 finished rows 100-199 and asked for another set of data, it would now be working on rows 200-299 at the same time as worker 2, which was supposed to be working on that data…

After processing rows 100-199, the new chunk 1 was rows 200-299:

--------------------------------------
| row 1-99    | chunk 0 for worker 0 |
| row 100-199 | COMPLETED            | (not returned to SELECT query)
| row 200-299 | chunk 1 for worker 1 | <- now both worker 1 and worker 2
| row 300-399 | chunk 2 for worker 2 |    are working on these rows
| row 400-499 | chunk 3 for worker 3 |
| ...         | ..                   |
--------------------------------------

Bad, bad, bad race conditions… inefficient processing of data… still only about 300 rows processed per second… 10 hours was still too long for the job… back to Google, StackOverflow, and the MySQL manual…

MySQL Bulk Data Recommendations (LOAD DATA INFILE)

The MySQL documentation has a lot of great tips for optimizing bulk inserts, but the most useful I found was this:

When loading a table from a text file, use LOAD DATA INFILE. This is usually 20 times faster than using INSERT statements. See Section 13.2.6, “LOAD DATA INFILE Syntax”.

Speed of INSERT Statements

I’ll take a 20x speed-up! Let’s rewrite the code to take advantage of LOAD DATA INFILE… I’ll be using tmp CSV files.
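
For reference, the CSV import boils down to a single statement. Here is a rough sketch of issuing it by hand through Sequel – the table, columns, and file path are placeholders matching the examples below, not exact production names:

# Roughly what a CSV import looks like as raw SQL (placeholder names).
# The single-quoted heredoc keeps \n literal for MySQL.
DB.run <<-'SQL'
  LOAD DATA INFILE '/tmp/worker_0_output0.txt'
  INTO TABLE output_table
  FIELDS TERMINATED BY ','
  LINES TERMINATED BY '\n'
  (fk_id, result)
SQL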

Job Invoker:

  def process( num_workers = 10 )
    Sidekiq.redis {|conn| conn.set('process_workers', num_workers.to_i) }
    puts 'spawning workers'
 
    Sidekiq.redis {|conn| conn.set('timer', Time.now.to_f) }
    num_workers.times do | worker_number |
      ProcessWorker.perform_async worker_number, 1000
    end
  end

Worker:

require 'sidekiq'
require 'sequel'
require 'sequel/load_data_infile'
 
class ProcessWorker
  include Sidekiq::Worker
 
  def total_num_of_workers
    Sidekiq.redis {|conn| conn.get('process_workers') }
  end
 
  def perform(worker_number, chunk_size, current_count = 0)
    file_data = File.open("/tmp/worker_#{worker_number}_output#{current_count}.txt", 'w')
 
    dataset = DB[:work_table].select_all(:work_table).left_outer_join(:output_table, :fk_id => :id).limit(chunk_size,total_num_of_workers.to_i*chunk_size*current_count+worker_number*chunk_size)
    output_table = DB[:output_table]
    dataset.each do |row|
      result = # Process the row data here
 
      file_data.puts "#{row[:id]},#{result}"
    end
 
    file_data.close
 
    output_table.load_csv_infile("/tmp/worker_#{worker_number}_output#{current_count}.txt", [ :fk_id, :result ])
 
 
    ## TODO: it'd be nice if we clean up the tmp directory when we're done.
 
    if dataset.count > 0
      puts "spawn worker to do more work again."
 
      ProcessWorker.perform_async worker_number, chunk_size, current_count+1
    else
      start_time_raw=Sidekiq.redis {|conn| conn.get('timer') }
      start_time=Time.at(start_time_raw.to_f)
      end_time = Time.now
 
      puts "Worker ##{worker_number} reports job took #{(end_time - start_time)*1000} milliseconds"
    end
  end
end

I made the following changes to the code:

  • I’m using the sequel load_data_infile gem… love that there’s a gem for everything. :)
  • I avoided the race condition by using LIMIT and OFFSET that always advance, instead of basing the query on data that is still being processed (see the sketch after this list).
  • We write each processed batch to a CSV file and have MySQL import it.
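
To make the chunk bookkeeping concrete, here is a simplified sketch of how each worker’s window advances (illustrative numbers, not the production values):

# Each worker strides forward by (num_workers * chunk_size) per pass, so the
# windows never overlap and never depend on which rows were already processed.
num_workers = 4
chunk_size  = 1000

2.times do |current_count|
  num_workers.times do |worker_number|
    offset = num_workers * chunk_size * current_count + worker_number * chunk_size
    puts "worker #{worker_number}, pass #{current_count}: offset #{offset}, next #{chunk_size} rows"
  end
end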

Let’s give it a run…

Worker #1 reports validate job took 2153123.5738263847 milliseconds

Yay! We were finally able to process the full dataset in a reasonable amount of time! Half an hour is not too bad for 8.5 million records, right? Actually, we can probably optimize this a bit more…

Scanning By Primary ID instead of LIMIT/OFFSET

I noticed that the workers processing data at the beginning of the table were returning fast – 1-5 seconds – but as we reached the end of the table, around the 4 million mark, things got extremely slow and the CPU would start working really hard.

I did a little digging and found the issue: LIMIT/OFFSET has to scan the entire result set up to the offset you ask for, discard all the rows at the beginning, and then return only the rows you want. In other words, MySQL was reading through 4,001,000 records to give me records 4,000,000-4,001,000, then through 4,002,000 records to give me the next chunk, and so on… no wonder the query was getting slower and slower!
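
You can see the difference just by printing the SQL that Sequel generates for the two query shapes (a sketch with made-up numbers, using the same table name as the examples here):

# Offset pagination: MySQL has to walk and discard the first 4,000,000 rows.
puts DB[:work_table].limit(1000, 4_000_000).sql
# roughly: SELECT * FROM `work_table` LIMIT 1000 OFFSET 4000000

# Range predicate on the primary key: MySQL can seek straight to the chunk.
puts DB[:work_table].where { (id >= 4_000_000) & (id < 4_001_000) }.sql
# roughly: SELECT * FROM `work_table` WHERE ((`id` >= 4000000) AND (`id` < 4001000))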

Let’s fix that:

  def perform(worker_number, chunk_size, current_count = 0)
    file_data = File.open("/tmp/worker_#{worker_number}_output#{current_count}.txt", 'w')
 
 
    low = total_num_of_workers.to_i*chunk_size*current_count+worker_number*chunk_size
    high = total_num_of_workers.to_i*chunk_size*current_count+worker_number*chunk_size+chunk_size
    dataset = DB[:work_table].select_all(:work_table).left_outer_join(:output_table, :fk_id => :id).where{(Sequel.qualify(:work_table,:id) >= low)}.where{Sequel.qualify(:work_table,:id) < high}
    # debugging
    puts dataset.sql
    output_table = DB[:output_table]
    dataset.each do |row|
      result = # Process the row data here
 
      file_data.puts "#{row[:id]},#{result}"
    end
 
    file_data.close
 
    output_table.load_csv_infile("/tmp/worker_#{worker_number}_output#{current_count}.txt", [ :fk_id, :result ])
 
 
    ## TODO: it'd be nice if we clean up the tmp directory when we're done.
 
    last_auto_increment_id = DB["SELECT `AUTO_INCREMENT` FROM  INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'my_db' AND TABLE_NAME = 'work_table'"].first[:AUTO_INCREMENT]
    if total_num_of_workers.to_i*chunk_size*current_count < last_auto_increment_id
      puts "spawn worker to do more work again."
 
      ProcessWorker.perform_async worker_number, chunk_size, current_count+1
    else
      start_time_raw=Sidekiq.redis {|conn| conn.get('timer') }
      start_time=Time.at(start_time_raw.to_f)
      end_time = Time.now
 
      puts "Worker ##{worker_number} reports validate job took #{(end_time - start_time)*1000} milliseconds"
    end
  end
end

We are now scanning the table by primary ID from 1 up to the AUTO_INCREMENT counter, so we are guaranteed to cover all the rows. This SELECT was fast! If a range had no data, the query returned almost instantaneously. I had some gaps in my table from deletes, so it was really critical that this query came back fast when it had no results. Overall, I probably only lost a few trivial seconds skipping over deleted IDs. Let’s run our benchmark again:

Worker #9 reports validate job took 208492.26823838233 milliseconds

Wow! We’re able to process 8.5 million records within FOUR MINUTES! That is certainly fast enough for me to work with this database on a regular basis.

Applying What We Learned to UPDATEs

It is unfortunate that we can’t use LOAD DATA INFILE for updates, but let us see if we can apply the primary-key scan and parallelization techniques to UPDATE-ing the existing output_table that we created.

require 'sidekiq'
require 'sequel'
 
class ProcessUpdateWorker
  include Sidekiq::Worker
 
  def total_num_of_workers
    Sidekiq.redis {|conn| conn.get('process_update_workers') }
  end
 
 
  def perform(worker_number, chunk_size, current_count = 0)
    DB.run('SET autocommit=0;')
    DB.run('SET unique_checks=0;')
    DB.run('SET foreign_key_checks=0;')
 
 
    DB.transaction do
      low = total_num_of_workers.to_i*chunk_size*current_count+worker_number*chunk_size
      high = total_num_of_workers.to_i*chunk_size*current_count+worker_number*chunk_size+chunk_size
 
      dataset = DB[:work_table].select_all(:work_table).select_append(:data).left_outer_join(:output_table, :fk_id => :id).where(:result2 => nil).where{(Sequel.qualify(:work_table,:id) >= low)}.where{Sequel.qualify(:work_table,:id) < high}
 
      output_table = DB[:output_table]
      puts dataset.sql
      dataset.each do |row|
        result2 = # process row here
        output_table.where("fk_id= ?", row[:id]).update(:result2 => "#{result2}")
      end
 
      last_auto_increment_id = DB["SELECT `AUTO_INCREMENT` FROM  INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'my_db' AND TABLE_NAME = 'work_table'"].first[:AUTO_INCREMENT]
      if total_num_of_workers.to_i*chunk_size*current_count < last_auto_increment_id
        puts "spawn worker to do more work again."
        ProcessUpdateWorker.perform_async worker_number, chunk_size, current_count+1
      else
        start_time=Time.at(Sidekiq.redis {|conn| conn.get('validate_timer') }.to_f)
        end_time = Time.now
 
        puts "Worker ##{worker_number} reports job took #{(end_time - start_time)*1000} milliseconds"
      end
    end
 
    DB.run('SET autocommit=1;')
    DB.run('SET unique_checks=1;')
    DB.run('SET foreign_key_checks=1;')
 
  end
end

For this job, I added the recommended step of turning autocommit, unique_checks, and foreign_key_checks off (and back on at the end), but with my database I don’t think I saw any improvement – I wasn’t really using unique or foreign keys much.

Anyways, after running this job, I was able to make 8.5 million updates in 26 minutes:

Worker #6 reports validate job took 1557370.9786546987 milliseconds

Not that bad, but could we make this faster??

LOAD DATA INFILE to temp table + UPDATE by JOIN

Doing more research, I saw that you can UPDATE a column in one table from another table via a join. That gave me the idea of loading the column I wanted to change into a tmp table and then overwriting the column I want to update. Would it work?
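
The core of the idea is a single multi-table UPDATE – the same statement the last worker fires in the code below:

# Overwrite output_table.result2 from the freshly loaded tmp_table via a join.
DB.run <<-'SQL'
  UPDATE output_table, tmp_table
  SET output_table.result2 = tmp_table.result2
  WHERE output_table.fk_id = tmp_table.fk_id
SQL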

Job Invoker:

  def bulk_insert_then_update( num_workers = 10)
    Sidekiq.redis {|conn| conn.set('bulk_insert_then_update_num_workers', num_workers.to_i)}
    Sidekiq.redis {|conn| conn.del('worker_complete_count')}
    puts 'starting bulk insert then update job'
 
    Sidekiq.redis {|conn| conn.set('bulk_insert_then_update_timer', Time.now.to_f) }
    DB.create_table! :tmp_table do
      primary_key :id
      foreign_key :fk_id, :work_table
      String :result2
    end
    num_workers.times do | worker_number |
      BulkInsertThenUpdate.perform_async worker_number, 20000
    end
  end

Worker:

require 'sidekiq'
require 'sequel'
 
class BulkInsertThenUpdate
  include Sidekiq::Worker
 
  def total_num_of_workers
    Sidekiq.redis {|conn| conn.get('bulk_insert_then_update_num_workers') }
  end
 
  def perform(worker_number, chunk_size, current_count = 0)
    file_data = File.open("/tmp/worker_#{worker_number}_output#{current_count}.txt", 'w')
 
    low = total_num_of_workers.to_i*chunk_size*current_count+worker_number*chunk_size
    high = total_num_of_workers.to_i*chunk_size*current_count+worker_number*chunk_size+chunk_size
    dataset = DB[:work_table].select_all(:work_table).select_append(Sequel.qualify(:output_table,:result)).left_outer_join(:output_table, :fk_id => :id).where(Sequel.qualify(:output_table,:result2) => nil).where{(Sequel.qualify(:work_table,:id) >= low)}.where{Sequel.qualify(:work_table,:id) < high}
    puts dataset.sql
    tmp_table = DB[:tmp_table]
    dataset.each do |row|
      result2 = # calculate results from row here
      file_data.puts "#{row[:id]},#{result2}"
    end
 
    file_data.close
 
    tmp_table.load_csv_infile("/tmp/worker_#{worker_number}_output#{current_count}.txt", [ :fk_id, :result2 ])
 
 
    ## TODO: it'd be nice if we clean up the tmp directory when we're done.
 
    last_auto_increment_id = DB["SELECT `AUTO_INCREMENT` FROM  INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'my_db' AND TABLE_NAME = 'work_table'"].first[:AUTO_INCREMENT]
    if total_num_of_workers.to_i*chunk_size*current_count < last_auto_increment_id
      puts "spawn worker to do more work again."
 
      BulkInsertThenUpdate.perform_async worker_number, chunk_size, current_count+1
    else
      complete_worker_count = Sidekiq.redis {|conn| conn.incr('worker_complete_count') }
 
      if complete_worker_count.to_i == total_num_of_workers.to_i
        puts "all workers complete - running bulk sql update command..."
        DB.run('UPDATE output_table,tmp_table SET output_table.result2 = tmp_table.result2 WHERE output_table.fk_id = tmp_table.fk_id;')
        DB.run('DROP TABLE tmp_table')
      end
      start_time_raw=Sidekiq.redis {|conn| conn.get('bulk_insert_then_update_timer') }
      start_time=Time.at(start_time_raw.to_f)
      end_time = Time.now
 
      puts "Worker ##{worker_number} reports validate job took #{(end_time - start_time)*1000} milliseconds"
    end
  end
end

My strategy here was to do the exact same thing as the LOAD DATA INFILE pass, and then, once all the workers were done, load the data from tmp_table into output_table. I kept a count of completed workers in Redis so that the last worker to finish would run the final UPDATE.

What’s the runtime?

Worker #7 reports validate job took 394662.35620292247 milliseconds

We got updates down to under 7 minutes!

Final Optimization Notes

To tune the system, you want to pay close attention to the CPU time and the iowait time, which you can see with top. If the iowait time is high, consider lowering the chunk size to work on smaller chunks at a time. I was able to keep my iowait time under 5%. If the CPU isn’t fully loaded, feel free to up the number of workers.

For my system:

i7 3770 3.4GHz
16GB RAM
256GB Samsung 850 Pro
MySQL 5.5

I ended up with 20 workers and a 20,000-row chunk size (rows selected at once). When I tried increasing the number of workers further, it actually had a negative effect on the benchmark, so there is a maximum that is beneficial. If you have more RAM, I would also consider tuning the MySQL server and even trying to keep the whole DB buffered in memory (innodb_buffer_pool_size). MySQLTuner is a good resource, and so is dba.stackexchange.com.
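
If you go down the innodb_buffer_pool_size route, it is worth checking what you are currently running with first. A quick way to peek at it through Sequel (the value is reported in bytes):

# Print the current InnoDB buffer pool size before deciding whether to raise it.
puts DB["SHOW VARIABLES LIKE 'innodb_buffer_pool_size'"].first.inspect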

TL;DR

  • Parallelize the work with Sidekiq – make full use of modern multi-core PCs!
  • Use LOAD DATA INFILE over INSERT statements
  • Scan the table by primary key (WHERE id >= start AND id < end) rather than using LIMIT/OFFSET
  • Large updates can be made by loading data into a tmp_table with LOAD DATA INFILE and then updating the column via a join with the tmp_table
  • Watch your CPU usage and iowait time, and optimize so that CPU utilization is high and iowait is low.

Finally:

8.5 million rows can be SELECT’d and INSERT’d within 3.5 minutes (now 1.5 minutes).
8.5 million rows can be SELECT’d and UPDATE’d within 7 minutes (now 4.5 minutes).

**Update** I made it even faster by learning how to truly parallelize Sidekiq!

I would love to hear if you know of techniques that are even faster than this!

RSpec let! and before

‘let’ in RSpec allows you to define memoized helper methods for your examples. Official docs:

Use let to define a memoized helper method. The value will be cached across
multiple calls in the same example but not across examples.

‘let!’ allows you to define a helper that is also invoked in a `before` hook. I was curious about the order of execution of ‘before’ hooks and let!, so I ran a simple test:

context 'when testing before' do
  before :all do
    p 'before :all invoked'
  end
 
  before do
    p 'before :each invoked'
  end
 
  let!(:test) { p 'let! invoked' }
 
  it 'prints debug statements' do
  end
end

The results were:

"before :all invoked"
"before :each invoked"
"let! invoked"

So it seems the before hooks are run BEFORE the let! hooks. This is useful to note: if you need the let! method to run before the before block, you’ll need to call it manually yourself, like the following:

context 'when testing before' do
  before :all do
    p 'before :all invoked'
  end
 
  before do
    test
    p 'before :each invoked'
  end
 
  let!(:test) { p 'let! invoked' }
 
  it 'prints debug statements' do
  end
end

and we get:

"before :all invoked"
"let! invoked"
"before :each invoked"

Amazon Elastic Beanstalk Hooks

I’ve been working a lot with Amazon Elastic Beanstalk lately to scale our application out. I noticed that there are some nice hooks that Elastic Beanstalk provides in the /opt/elasticbeanstalk/hooks/ directory. I don’t know why, but Amazon never documented these hooks, even though they are very useful for setting up your Elastic Beanstalk environment before and/or after your application is deployed. I’ve seen pre-deploy scripts set up additional applications or services on each box (a local redis cache service, for example) and post-deploy scripts run a rake task after everything is set up (did I mention I don’t program in PHP anymore?).

One thing I wish Amazon would do is document and lock down this feature, because it is very useful. It seems Amazon doesn’t really want to do that, and the forums warn that it is undocumented and the hooks are subject to change. Still, it is good to note what order the hooks are fired in and what actually triggers them, so here’s my attempt to document all that. You can easily see this with eb logs or by logging into the server via ssh and checking /var/log/eb-activity.log.

Here are the events that are fired on the AMI I am using for Ruby (64bit Amazon Linux 2015.03 v1.4.1 running Ruby 2.2 (Puma)):

Deploy New Instance

  Successfully execute hooks in directory /opt/elasticbeanstalk/hooks/preinit.
  Successfully execute hooks in directory /opt/elasticbeanstalk/hooks/appdeploy/pre.
  Successfully execute hooks in directory /opt/elasticbeanstalk/hooks/appdeploy/enact.
  Successfully execute hooks in directory /opt/elasticbeanstalk/hooks/appdeploy/post.
  Successfully execute hooks in directory /opt/elasticbeanstalk/hooks/postinit.
  Successfully execute hooks in directory /opt/elasticbeanstalk/addons/logpublish/hooks/config.

eb deploy

  Successfully execute hooks in directory /opt/elasticbeanstalk/hooks/appdeploy/pre.
  Successfully execute hooks in directory /opt/elasticbeanstalk/hooks/appdeploy/enact.
  Successfully execute hooks in directory /opt/elasticbeanstalk/hooks/appdeploy/post.
  Successfully execute hooks in directory /opt/elasticbeanstalk/addons/logpublish/hooks/config.

Reconfigure Scaling

No hooks executed

Changing Environment Properties (eb setenv)

  Successfully execute hooks in directory /opt/elasticbeanstalk/hooks/configdeploy/pre.
  Successfully execute hooks in directory /opt/elasticbeanstalk/hooks/configdeploy/enact.
  Successfully execute hooks in directory /opt/elasticbeanstalk/hooks/configdeploy/post.
  Successfully execute hooks in directory /opt/elasticbeanstalk/addons/logpublish/hooks/config.

Restart App Server

  Successfully execute hooks in directory /opt/elasticbeanstalk/hooks/restartappserver/pre.
  Successfully execute hooks in directory /opt/elasticbeanstalk/hooks/restartappserver/enact.
  Successfully execute hooks in directory /opt/elasticbeanstalk/hooks/restartappserver/post.
  Successfully execute hooks in directory /opt/elasticbeanstalk/addons/logpublish/hooks/config.

Did I miss any event? Please let me know in the comments and I can add it to the list.

If you want your app to deploy files into these directories, Elastic Beanstalk provides a way to place arbitrary files on the instance and run commands on deploy via the .ebextensions config files. You can check out Danne Manne’s blog on how to do this.

It appears the hooks in each directory are run in sorted order, so if you want something to run before all the other hooks, prefix it with a low number such as 01script. If you want it to run after the other scripts, give it a high number such as 99script.

I’m officially a Ruby On Rails Developer Now

I know I don’t post too much to this blog, but you might have noticed that I have stopped posting about PHP and started posting about Ruby and Rails. Why’s that? Well, it’s because my company has switched over to being a RoR shop. Here’s my short opinion on it…

Ruby and the Rails community seem both more mature and more bleeding edge. How can I put those two together? Well, all the stuff I was working on in PHP was strongly influenced by, or straight up copied from, Rails. I think Rails was one of the most popular MVC frameworks and it inspired many of the others (and yes, I know there were MVC frameworks before Rails). Symfony, for example, draws much of its design from Rails:

app/console -> rails/rake
doctrine -> activerecord

As Rails has been doing it for much longer (Rails is on version 4), it has a much more developed community. I find myself discovering more gems and writing less of my own stuff compared to PHP. Granted, I still make PRs to the Ruby/Rails community now.

I’m still getting my head around Ruby as a language, but it is a much more concise way to write than PHP. Blocks in particular are a definite advantage over other languages. Yes, you can implement something similar in PHP or other languages with anonymous functions or callbacks, but Ruby makes it very natural and intuitive. Overall, after getting over the initial hump of learning a new language, I like it.

One thing that has helped me transition more easily is having a JetBrains IDE. I moved straight from PHPStorm to RubyMine and was very happy that almost 100% of the keybindings I was used to in PHPStorm carried right over to RubyMine. The Ruby debugger actually seems more flexible than the PHP debugger, as I can execute arbitrary code at a breakpoint. This is great for inspecting objects and variables. You could probably do this with PHP, but it just wasn’t as simple.

I look forward to the next languages I’ll be learning. So far, going from PHP -> Java -> PHP -> Ruby hasn’t been a bad experience for me. Each time, I learned something new and was able to apply principles from one language in another. I think learning different languages helps broaden your perspective on design patterns, as each language lends itself to writing code in a certain manner.

Why Can’t I Put Debug Statements in Symfony Core?

Just a quick tip for anyone trying to debug Symfony core files, such as \Symfony\Component\HttpFoundation\Request or almost anything in HttpFoundation. I tried adding a debug print_r() and was wondering why my code was not being executed. There didn’t seem to be any other place where the Request class was defined… I thought maybe it was a cache issue, so I did:

$ app/console cache:clear

It didn’t help… It turns out that all these files are cached and lumped into the app/bootstrap.php.cache file, which is only regenerated during a composer install via:

Sensio\\Bundle\\DistributionBundle\\Composer\\ScriptHandler::buildBootstrap

So to summarize: if you want to add debug statements to HttpFoundation classes, you’ll have to edit the bootstrap.php.cache file. Be careful, though, and don’t mess up your framework! Hope this helps somebody! :)

Online Regex Editor

I don’t know about you, but it isn’t the funnest part of my job when I have to pull out Regular Expressions and write a super long expression to match something… I especially despise when I have to fix someone else’s Regex!

^((([0-9]+)\.([0-9]+)\.([0-9]+)(?:-([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?)(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?)$

You try figuring out what that does!

Well, it doesn’t have to be as painful with the online regex tool I recently found out about:

http://regexr.com/

On this site, you can write a regex and the sample text that needs to be matched, and it will highlight whether it matches or not.

(Screenshot: regexr.com highlighting matches against the sample text)

As well as get helpful hints to remember what the special characters mean:

(Screenshot: regexr.com’s reference panel explaining the special characters)

Regex can definitely be confusing, as many characters have special meaning based on the context – are we talking about a literal “:” or the ?: that marks a non-capturing group?

All in all, this tool makes Regular Expressions so easy! They also have a community feature so you can see if someone has uploaded a Regex pattern that you can build off of.

For my project today, I had to build a Regex that matches a semver version string. Here’s the Regex pattern I built with a little help from my coworker – thanks Daniel!

http://regexr.com/39s32
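
As a quick sanity check, here is the same pattern dropped into a short Ruby snippet (the version strings are just made-up samples):

# The semver pattern from above, tried against a few sample strings.
SEMVER = /^((([0-9]+)\.([0-9]+)\.([0-9]+)(?:-([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?)(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?)$/

['1.0.0', '2.1.3-beta.1', '1.0.0+build.42', 'not-a-version'].each do |version|
  puts "#{version}: #{version =~ SEMVER ? 'matches' : 'no match'}"
end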

Asserting the Output of Complex Arrays in PHPUnit/PHPStorm

In unit testing, have you ever had to assert a really complex array that you didn’t want to generate the whole expected value for by hand? Such as:

array (
  0 => 
  array (
    0 => 'var 0-0',
    1 => 'var 0-1',
    2 => 'var 0-2',
    3 => 'var 0-3',
    4 => 'var 0-4',
    5 => 'var 0-5',
    6 => 'var 0-6',
    7 => 'var 0-7',
    8 => 'var 0-8',
    9 => 'var 0-9',
  ),
  1 => 
  array (
    0 => 'var 1-0',
    1 => 'var 1-1',
    2 => 'var 1-2',
    3 => 'var 1-3',
    4 => 'var 1-4',
    5 => 'var 1-5',
    6 => 'var 1-6',
    7 => 'var 1-7',
    8 => 'var 1-8',
    9 => 'var 1-9',
  ),
  2 => 
  array (
    0 => 'var 2-0',
    1 => 'var 2-1',
    2 => 'var 2-2',
    3 => 'var 2-3',
    4 => 'var 2-4',
    5 => 'var 2-5',
    6 => 'var 2-6',
    7 => 'var 2-7',
    8 => 'var 2-8',
    9 => 'var 2-9',
  ),
  3 => 
  array (
    0 => 'var 3-0',
    1 => 'var 3-1',
    2 => 'var 3-2',
    3 => 'var 3-3',
    4 => 'var 3-4',
    5 => 'var 3-5',
    6 => 'var 3-6',
    7 => 'var 3-7',
    8 => 'var 3-8',
    9 => 'var 3-9',
  ),
  4 => 
  array (
    0 => 'var 4-0',
    1 => 'var 4-1',
    2 => 'var 4-2',
    3 => 'var 4-3',
    4 => 'var 4-4',
    5 => 'var 4-5',
    6 => 'var 4-6',
    7 => 'var 4-7',
    8 => 'var 4-8',
    9 => 'var 4-9',
  ),
  5 => 
  array (
    0 => 'var 5-0',
    1 => 'var 5-1',
    2 => 'var 5-2',
    3 => 'var 5-3',
    4 => 'var 5-4',
    5 => 'var 5-5',
    6 => 'var 5-6',
    7 => 'var 5-7',
    8 => 'var 5-8',
    9 => 'var 5-9',
  ),
  6 => 
  array (
    0 => 'var 6-0',
    1 => 'var 6-1',
    2 => 'var 6-2',
    3 => 'var 6-3',
    4 => 'var 6-4',
    5 => 'var 6-5',
    6 => 'var 6-6',
    7 => 'var 6-7',
    8 => 'var 6-8',
    9 => 'var 6-9',
  ),
  7 => 
  array (
    0 => 'var 7-0',
    1 => 'var 7-1',
    2 => 'var 7-2',
    3 => 'var 7-3',
    4 => 'var 7-4',
    5 => 'var 7-5',
    6 => 'var 7-6',
    7 => 'var 7-7',
    8 => 'var 7-8',
    9 => 'var 7-9',
  ),
  8 => 
  array (
    0 => 'var 8-0',
    1 => 'var 8-1',
    2 => 'var 8-2',
    3 => 'var 8-3',
    4 => 'var 8-4',
    5 => 'var 8-5',
    6 => 'var 8-6',
    7 => 'var 8-7',
    8 => 'var 8-8',
    9 => 'var 8-9',
  ),
  9 => 
  array (
    0 => 'var 9-0',
    1 => 'var 9-1',
    2 => 'var 9-2',
    3 => 'var 9-3',
    4 => 'var 9-4',
    5 => 'var 9-5',
    6 => 'var 9-6',
    7 => 'var 9-7',
    8 => 'var 9-8',
    9 => 'var 9-9',
  ),
  10 => 
  array (
    0 => 'var 10-0',
    1 => 'var 10-1',
    2 => 'var 10-2',
    3 => 'var 10-3',
    4 => 'var 10-4',
    5 => 'var 10-5',
    6 => 'var 10-6',
    7 => 'var 10-7',
    8 => 'var 10-8',
    9 => 'var 10-9',
  ),
)

(I don’t want to type that up by hand!)

Well, guess what? You can be lazy about it… simply use var_export and PHP will generate all the code for you to copy and paste straight in.

var_export($complexObj);

**DISCLAIMER WARNING** This is generally a VERY bad practice. You are always supposed to create the assertions independently from what the output generates. It is very easy to copy and paste a mistake in output that is not actually correct. If you do this, be sure to look very carefully at the output and make sure it is exactly what it is supposed to be.

Lessons Learned in VPN Networking Domain Controllers

I had to set up a 2nd domain controller at an offsite location this past week. I don’t have any good VPN routing equipment, so I was just going to use OpenVPN to create a tunnel between the two sites. I set up OpenVPN for site-to-site routing and everything seemed to work… I could browse the shares from both sides, and everyone was happy and could ping each other. I was even able to successfully install the 2nd domain controller and join the Windows domain. But then I started getting all these random issues.

I spent a lot of time troubleshooting before I realized the issue: I had set up the two OpenVPN servers to masquerade (NAT) between the two subnets, so all the traffic looked like it was coming from the VPN server itself. The RPC calls were failing, presumably because of the extra ports the domain controllers were trying to open to communicate with each other. I fixed that by making the OpenVPN servers properly route between the subnets, and then DFSR was able to sync and replicate the two domain controllers. Moral of the story – make sure you set up proper routing, not NAT, between domain controllers!

If you were able to get DCs working over masquerading NAT, please let me know. I’d be interested to hear whether that is possible.

Removing old Linux Kernels In Ubuntu

Ever get this on Linux?

$ df -h /boot/
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 228M 228M 0M 100% /boot

Looks like Ubuntu has been updating kernels without cleaning up after itself. Bad Ubuntu! And it is terrible that the default boot partition is only around 230MB. A quick Google search gives me this one-liner:

$ dpkg -l linux-* | awk '/^ii/{ print $2}' | grep -v -e `uname -r | cut -f1,2 -d"-"` | grep -e [0-9] | xargs sudo apt-get -y purge

Thanks tuxtweats! http://tuxtweaks.com/2010/10/remove-old-kernels-in-ubuntu-with-one-command/

**UPDATE**: You don’t have to do this anymore… You can now just do:

$ sudo apt-get autoremove

This will clean up old kernels. The only time it won’t work is if /boot is already at 100%; then you have to clear a few old kernels out manually, resume installation of the latest kernel, and then run the autoremove command.