ruby on rails - Puma or Unicorn vs WEBrick load test benchmark shows no improvement


The setup

OK, I am running a Rails application on Heroku (free tier).

I have 2 separate application releases; let's call them staging and fake-production.

In staging, I am using the WEBrick server. Its Procfile is:

web: rails s -p $PORT

In fake-production, I am using the Puma server. Its Procfile is:

web: bundle exec puma -C config/puma.rb

I have configured Puma to run 2 workers with 1 thread per worker. config/puma.rb is defined below (taken from Heroku's guide on setting up the Puma web server):

    workers Integer(ENV['WEB_CONCURRENCY'] || 2)
    threads_count = Integer(ENV['MAX_THREADS'] || 1)
    threads threads_count, threads_count

    preload_app!

    rackup      DefaultRackup
    port        ENV['PORT']     || 3000
    environment ENV['RACK_ENV'] || 'development'

    on_worker_boot do
      # Worker specific setup for Rails 4.1+
      # See: https://devcenter.heroku.com/articles/deploying-rails-applications-with-the-puma-web-server#on-worker-boot
      ActiveRecord::Base.establish_connection
    end

My database.yml is configured to have a connection pool of 20.

The test

For load testing, I used the ApacheBench tool from my laptop to hit an API endpoint. The API is a simple database query that returns a fixed number of records (it does not change).

I hit both deployments with the following command:

ab -n 1000 -c 100 https://<some heroku endpoint>?access_token=f73f50514c 

The results

The results here are surprising. I was expecting the Puma deployment to completely trash the WEBrick deployment, but in reality they perform about the same. I tried hitting different API endpoints with different combinations of Puma workers and threads (at one point, 4 workers and 5 threads), and yet there weren't any visible improvements.

WEBrick results

    Server Software:        WEBrick/1.3.1
    Server Hostname:        webbrick-build.herokuapp.com
    Server Port:            443
    SSL/TLS Protocol:       TLSv1,DHE-RSA-AES128-SHA,2048,128

    Document Path:          /api/v1/packages?access_token=f73f50514c6
    Document Length:        488 bytes

    Concurrency Level:      100
    Time taken for tests:   21.484 seconds
    Complete requests:      1000
    Failed requests:        0
    Total transferred:      995000 bytes
    HTML transferred:       488000 bytes
    Requests per second:    46.55 [#/sec] (mean)
    Time per request:       2148.360 [ms] (mean)
    Time per request:       21.484 [ms] (mean, across all concurrent requests)
    Transfer rate:          45.23 [Kbytes/sec] received

    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:      714 1242 278.1   1214    2012
    Processing:   248  842 493.6    699    2883
    Waiting:      247  809 492.3    677    2876
    Total:       1072 2085 643.5   1929    4845

    Percentage of the requests served within a certain time (ms)
      50%   1929
      66%   2039
      75%   2109
      80%   2168
      90%   2622
      95%   3821
      98%   4473
      99%   4646
     100%   4845 (longest request)

Memory impact

source=web.1 dyno=heroku.1234567899 sample#memory_total=198.41MB sample#memory_rss=197.60MB sample#memory_cache=0.30MB sample#memory_swap=0.51MB sample#memory_pgpgin=103879pages sample#memory_pgpgout=53216pages

Puma results (more or less the same regardless of worker/thread count)

    Server Software:        Cowboy
    Server Hostname:        puma-build.herokuapp.com
    Server Port:            443
    SSL/TLS Protocol:       TLSv1,DHE-RSA-AES128-SHA,2048,128

    Document Path:          /api/v1/packages?access_token=fb7168c147adc2ccd83b2
    Document Length:        489 bytes

    Concurrency Level:      100
    Time taken for tests:   23.299 seconds
    Complete requests:      1000
    Failed requests:        0
    Total transferred:      943000 bytes
    HTML transferred:       489000 bytes
    Requests per second:    42.92 [#/sec] (mean)
    Time per request:       2329.949 [ms] (mean)
    Time per request:       23.299 [ms] (mean, across all concurrent requests)
    Transfer rate:          39.52 [Kbytes/sec] received

    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:      743 1304 283.9   1287    2092
    Processing:   253  951 740.3    684    5353
    Waiting:      253  898 729.0    627    5196
    Total:       1198 2255 888.0   1995    7426

    Percentage of the requests served within a certain time (ms)
      50%   1995
      66%   2085
      75%   2213
      80%   2444
      90%   3755
      95%   4238
      98%   5119
      99%   5437
     100%   7426 (longest request)

Memory impact (4 workers, 5 threads)

source=web.1 dyno=heroku.1234567890 sample#memory_total=406.75MB sample#memory_rss=406.74MB sample#memory_cache=0.00MB sample#memory_swap=0.00MB sample#memory_pgpgin=151515pages sample#memory_pgpgout=47388pages

Based on the snippets above, the Puma deployment is sometimes faster than WEBrick, while at other times it can be slower (as shown in the snippet). Even when it is faster, the speed-up is not significant: an increase of 1-5 requests/sec.
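To put numbers on that comparison, here is a minimal sketch that recomputes throughput from the two ab runs quoted above and also reports how much of the mean total time fell into ab's "Connect" phase. The figures are copied from the output above; the names `RUNS`, `requests_per_second` and `connect_share` are made up for illustration. The connect share is only one possible way to read the data, not a confirmed diagnosis.

```ruby
# Figures copied from the ab output quoted above (mean values, in ms where noted).
RUNS = {
  "WEBrick" => { requests: 1000, seconds: 21.484, connect_ms: 1242, total_ms: 2085 },
  "Puma"    => { requests: 1000, seconds: 23.299, connect_ms: 1304, total_ms: 2255 }
}

# Throughput is completed requests divided by the wall-clock time of the test.
def requests_per_second(run)
  (run[:requests] / run[:seconds]).round(2)
end

# Fraction of the mean total request time that ab attributes to "Connect".
def connect_share(run)
  (run[:connect_ms].to_f / run[:total_ms]).round(2)
end

RUNS.each do |name, run|
  puts "#{name}: #{requests_per_second(run)} req/s, connect share #{connect_share(run)}"
end
```

In both runs, more than half of the mean total time is spent in the Connect phase, which the application server has little influence over. That would be consistent with the two deployments looking nearly identical from the laptop.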

My question here is: what am I doing wrong? Is the database pool somehow at fault? Am I benchmarking wrongly? Am I using Puma wrongly?

Edit:

Highest CPU load for Puma (5 workers, 5 threads each)

source=web.1 dyno=heroku.123456789 sample#load_avg_1m=2.98 

Most of the time, however, it is either 0.00 or smaller than 0.1.

On top of that, the code called in the controller is:

@package = Package.all

Immediately after, it is followed by the rendering of a JSON response declared in HAML.

By the way, Package.all returns 5 records.

Edit 2:

Unicorn results

I implemented Unicorn according to a setup guide and am running 3 Unicorn workers.

    Server Software:        Cowboy
    Server Hostname:        unicorn-build.herokuapp.com
    Server Port:            443
    SSL/TLS Protocol:       TLSv1,DHE-RSA-AES128-SHA,2048,128

    Document Path:          /api/v1/packages?access_token=f73f50514c6b8a3ea
    Document Length:        488 bytes

    Concurrency Level:      100
    Time taken for tests:   22.311 seconds
    Complete requests:      1000
    Failed requests:        0
    Total transferred:      942000 bytes
    HTML transferred:       488000 bytes
    Requests per second:    44.82 [#/sec] (mean)
    Time per request:       2231.135 [ms] (mean)
    Time per request:       22.311 [ms] (mean, across all concurrent requests)
    Transfer rate:          41.23 [Kbytes/sec] received

    Connection Times (ms)
                  min  mean[+/-sd] median   max
    Connect:      846 1326 294.5   1304    2720
    Processing:   245  627 342.8    540    3061
    Waiting:      244  532 313.6    470    3057
    Total:       1232 1954 463.0   1874    4875

    Percentage of the requests served within a certain time (ms)
      50%   1874
      66%   2016
      75%   2161
      80%   2250
      90%   2466
      95%   2799
      98%   3137
      99%   3901
     100%   4875 (longest request)

One thing I've noticed is that running the same ab load test several times returns different "requests per second" figures. This applies to both Unicorn and Puma. For both of them, the best "requests per second" is 48-50, while the worst is 25-33.

Either way, this still does not make sense. Why aren't either Puma or Unicorn crushing WEBrick?

I trust you have followed Heroku's "Deploying Rails Applications with the Puma Web Server" guide thoroughly.

My guess is that your test environment minimizes the advantages of multi-threading, or that your HTTP server is bottlenecked by the SQL database.

Your API calls, if they cache the database results, can be CPU intensive. Having 10 threads is no advantage when the CPU is already used at 100% by one of them. Managing the threads can even hinder performance in that case.

Multi-threading is useful when the worker threads spend a long time waiting for resources (databases, files, etc.) instead of using the CPU.
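A minimal sketch of that point, assuming a plain Ruby script rather than a Rails app: `sleep` stands in for a slow database call, and the helper names (`io_bound_task`, `run_serial`, `run_threaded`) are made up for illustration. While a thread is waiting it does not need the CPU, so several waits can overlap.

```ruby
require "benchmark"

# Stand-in for a slow database or file call: the thread waits without using the CPU.
def io_bound_task
  sleep 0.05
end

# Run the tasks one after another and return the elapsed wall-clock seconds.
def run_serial(count)
  Benchmark.realtime { count.times { io_bound_task } }
end

# Run each task in its own thread, wait for all of them, and return the elapsed seconds.
def run_threaded(count)
  Benchmark.realtime do
    Array.new(count) { Thread.new { io_bound_task } }.each(&:join)
  end
end

serial = run_serial(4)
threaded = run_threaded(4)
puts format("serial: %.3fs, threaded: %.3fs", serial, threaded)
```

The threaded run should finish in roughly the time of a single wait, because the waits overlap. If the task were a tight CPU loop instead of a wait, the threads would compete for the same CPU and the gap would largely disappear.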

The second possibility is that your HTTP server is constrained by your database. It may be that WEBrick is already moving as fast as the database allows, leaving no room for improvement by switching to a better-performing HTTP server.

You should give this comprehensive benchmark report a read.

You will notice that Puma isn't one of the speediest Rails HTTP servers. If you care about speed, try Unicorn, or TorqueBox 4 if you are using JRuby.

Here's a guide on how to set up Unicorn on Heroku.

