Scaling FreeSWITCH Performance

  • We’re Hiring •  Linux developers C/C++ or Python •  Anywhere in the world, relocation paid or full-time remote opportunities •  Fun and relaxed work environment
  • Agenda •  Performance Basics •  FreeSWITCH Core Basics •  Performance Tweaks •  Feature Performance Cost •  Final Thoughts
  • Performance Basics •  Performance testing and measurement is hard to do and very prone to errors •  Performance can vary widely with seemingly minor hardware/software changes •  This presentation focuses on Linux and SIP call bridging performance
  • Performance Basics •  CPU-Bound •  I/O-Bound •  Threads, Resource/Lock Contention •  You cannot improve what you don’t measure
  • FreeSWITCH Core •  FreeSWITCH is an insanely threaded system (the good kind of insane)
  • FreeSWITCH Core •  Most threads are I/O bound •  But transcoding, transrating and tone generation introduce CPU-bound elements into the mix •  FreeSWITCH core I/O model is blocking, not async
  • FreeSWITCH Core •  Every call leg has its own session thread walking through a state machine, roughly like this: •  init -> routing -> execute app -> hangup -> reporting -> destroy
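The thread-per-leg model above can be pictured with a minimal Python sketch. It only mirrors the rough state order given in the slide; the names `LegState`, `walk_call_leg` and `run_legs` are hypothetical and are not FreeSWITCH identifiers, and the real core does blocking I/O in each state rather than just recording it.

```python
import threading
from enum import Enum


class LegState(Enum):
    # Rough order from the slide; not FreeSWITCH's internal state names.
    INIT = "init"
    ROUTING = "routing"
    EXECUTE = "execute app"
    HANGUP = "hangup"
    REPORTING = "reporting"
    DESTROY = "destroy"


def walk_call_leg(leg_id, results):
    """Walk one leg through every state in order and record the path."""
    path = []
    for state in LegState:  # Enum iteration follows definition order
        path.append(state.value)
    results[leg_id] = path


def run_legs(count):
    """Start one thread per call leg, as the slide describes, and wait for all."""
    results = {}
    threads = [
        threading.Thread(target=walk_call_leg, args=(leg_id, results))
        for leg_id in range(count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Calling `run_legs(2)` returns one recorded path per leg, which makes the cost model visible: every extra leg is an extra thread for the lifetime of that leg.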
  • FreeSWITCH Core •  Monitoring threads per signaling stack (e.g. sofia, freetdm) •  These threads are long-lived and perform very specific tasks (e.g. process SIP signaling outside of a call context, initial INVITE, etc.) •  Event subsystem launches threads for event dispatch
  • FreeSWITCH Core •  Conferences double the number of threads used per call leg. For each participant you have 2 threads: •  Session thread (handles call state and media output) •  Input conference thread (launched when joining the conference, reads media from the session)
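As a back-of-the-envelope aid, the per-leg thread counts above can be turned into a small estimate. This is a sketch under stated assumptions: the function name `estimate_call_threads` is hypothetical, a bridged call is assumed to consist of two legs, and signaling-stack, event-dispatch and feature threads are deliberately left out.

```python
def estimate_call_threads(bridged_calls, conference_participants):
    """Estimate threads created by call legs only.

    Assumptions (not from the slides): each bridged call has two legs.
    From the slides: one session thread per leg, plus one input
    conference thread per conference participant.
    """
    bridge_legs = 2 * bridged_calls
    session_threads = bridge_legs + conference_participants
    conference_input_threads = conference_participants
    return session_threads + conference_input_threads
```

For example, `estimate_call_threads(1000, 0)` gives 2000 and `estimate_call_threads(0, 100)` gives 200, which is the doubling the slide refers to.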
  • FreeSWITCH Core •  Even small features might launch threads •  e.g. setting timer=soft when performing a playback() launches an extra thread to consume media from the session
  • Performance Tweaks •  Logging adds stress to the event subsystem •  Every log statement is queued as an event •  Every log statement is delivered to logger modules (syslog, file, console) •  Set core logging level to warning in switch.conf.xml
  • Performance Tweaks •  Do not write debug logs to an SSD in a loaded system. You’ll kill the SSD soon :) •  If you want to keep debug level, you can put logs into tmpfs and rotate often
  • Database •  The native sqlite core database should go on tmpfs to avoid I/O bottlenecks •  On tmpfs, however, you risk losing SIP registration data on a power outage or any sudden restart (e.g. kernel panic) •  Most other data is transient (e.g. channels, SIP dialogs, etc.)
  • Database •  Eventually you might need to migrate to pgsql, mysql or some other database via odbc •  Allows you to move db workload elsewhere •  Better performance for applications that read the core info (channels, calls, etc.) Sangoma Technologies - © 2015
  • Database •  Tables such as channels, calls, tasks and sip_dialogs do not need to persist. You can move those tables to memory (e.g. MEMORY engine on MySQL) if you don’t need fault tolerance •  Remember to set auto-create-schemas=false and auto-clear-sql=false if you create the db schema on your own (see switch.conf.xml)
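A quick way to audit these settings is to parse the config and compare it with the values the slides suggest. The sketch below is illustrative only: `find_mismatches`, `RECOMMENDED` and the sample XML are hypothetical, and it assumes the core log level is stored in a `<param name="loglevel" .../>` entry, which should be verified against your own switch.conf.xml. The two auto-* values only make sense if you manage the schema yourself.

```python
import xml.etree.ElementTree as ET

# Values suggested in the slides; the "loglevel" param name is an assumption.
RECOMMENDED = {
    "loglevel": "warning",
    "auto-create-schemas": "false",
    "auto-clear-sql": "false",
}

# Made-up sample input, not a complete switch.conf.xml.
SAMPLE_CONF = """<configuration name="switch.conf">
  <settings>
    <param name="loglevel" value="debug"/>
    <param name="auto-create-schemas" value="true"/>
  </settings>
</configuration>"""


def find_mismatches(xml_text, recommended=RECOMMENDED):
    """Return {param: (actual, wanted)} for params that differ or are absent."""
    root = ET.fromstring(xml_text)
    actual = {p.get("name"): p.get("value") for p in root.iter("param")}
    return {
        name: (actual.get(name), wanted)
        for name, wanted in recommended.items()
        if actual.get(name) != wanted
    }
```

Running `find_mismatches(SAMPLE_CONF)` reports all three parameters, with `None` as the actual value for the one that is missing from the sample.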
  • Database •  If using MySQL: •  Use the InnoDB engine for better concurrency in data that requires persistence (e.g. SIP registration) •  innodb_flush_log_at_trx_commit=0 •  sync_binlog=0
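Both MySQL settings trade durability for speed: with innodb_flush_log_at_trx_commit=0 the redo log is flushed roughly once per second instead of at every commit, and sync_binlog=0 leaves binary log flushing to the operating system, so a crash can lose the most recent transactions. The sketch below checks a my.cnf-style text for the two values; `differing_settings` and the sample text are hypothetical, and real my.cnf files with `!include` directives would need extra handling.

```python
import configparser

# The two values named in the slide.
SLIDE_SETTINGS = {
    "innodb_flush_log_at_trx_commit": "0",
    "sync_binlog": "0",
}

# Made-up sample input for illustration.
SAMPLE_MY_CNF = """
[mysqld]
innodb_flush_log_at_trx_commit = 1
"""


def differing_settings(cnf_text, wanted=SLIDE_SETTINGS, section="mysqld"):
    """Return {option: current_value_or_None} for options that differ."""
    parser = configparser.ConfigParser(
        allow_no_value=True, strict=False, interpolation=None
    )
    parser.read_string(cnf_text)
    current = dict(parser[section]) if parser.has_section(section) else {}
    return {
        name: current.get(name)
        for name, value in wanted.items()
        if current.get(name) != value
    }
```

With the sample text, `differing_settings(SAMPLE_MY_CNF)` flags the first option as set to "1" and the second as absent.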
  • SIP Stack •  Sofia launches the following threads per profile: •  Main profile thread (runs sofia UA stack scheduling) •  Worker thread (checks expired registrations) •  Stack listener thread (accepting inbound traffic) •  You can distribute your traffic among more sofia profiles for improved concurrency
  • Memory Allocation •  FreeSWITCH uses memory pools •  Setups using modules that depend on libraries (or other modules) that do not use pools can benefit from an alternative memory allocator
  • Memory Allocation •  tcmalloc and jemalloc are good alternatives •  Mailing list reports mention improvements when using mod_perl •  Sangoma found very significant improvement on its SBC (based on FreeSWITCH)
  • Memory Allocation •  Easy to try on your own workload: •  LD_PRELOAD="libtcmalloc.so.x.x.x" ./freeswitch •  Recommended to run mysql with either tcmalloc or jemalloc
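When FreeSWITCH is started from a supervisor script rather than a shell, the same LD_PRELOAD trick can be applied through the child's environment. This is a minimal sketch: `preload_env` and `launch` are hypothetical helper names, and the library file name is left as the placeholder from the slide because the actual version depends on your system.

```python
import os
import subprocess


def preload_env(allocator_lib, base_env=None):
    """Return a copy of the environment with allocator_lib first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # Keep anything already preloaded; the dynamic linker accepts ':' separators.
    env["LD_PRELOAD"] = ":".join(part for part in (allocator_lib, existing) if part)
    return env


def launch(binary, allocator_lib):
    """Start the binary with the allocator preloaded (not called in this sketch)."""
    return subprocess.Popen([binary], env=preload_env(allocator_lib))


# Example call, with the placeholder name from the slide:
# launch("./freeswitch", "libtcmalloc.so.x.x.x")
```

Building the environment in a separate function keeps the change easy to compare: run the same load test with and without the preload and measure the difference.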
  • Dialplan •  Careful planning of your dialplan goes a long way •  Do not enable functionality you don’t need, everything has a cost •  Just loading a module might be consuming precious CPU cycles
  • Dialplan •  Common performance factors to consider (mind the performance cost of these features): •  Media relay •  Tone Detection •  Recording •  Transcoding
  • Measurement Tools •  switchy: A distributed load-generator •  https://github.com/sangoma/switchy •  vmstat plotter •  https://clusterbuffer.wordpress.com/2014/09/21/vmstat_plotter/
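If you only need numbers rather than plots, plain `vmstat` output is easy to post-process yourself. The following sketch is independent of the vmstat plotter linked above: `parse_vmstat` and `busy_cpu_percent` are hypothetical names, and the sample text is made up for illustration, not a measurement from this talk.

```python
# Made-up sample in the usual vmstat layout; not real measurements.
SAMPLE_VMSTAT = """procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0      0 812345  10240 204800    0    0     1     5  900 1500 20 10 69  1  0
 4  0      0 811000  10240 204900    0    0     0    12 1200 2500 35 15 49  1  0
"""


def parse_vmstat(text):
    """Turn vmstat text into a list of {column_name: int} rows."""
    lines = [line.split() for line in text.strip().splitlines()]
    # The column header is the line whose first field is "r".
    header_index = next(i for i, f in enumerate(lines) if f and f[0] == "r")
    names = lines[header_index]
    rows = []
    for fields in lines[header_index + 1:]:
        # Skip repeated headers or partial lines; keep purely numeric rows.
        if len(fields) == len(names) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(names, map(int, fields))))
    return rows


def busy_cpu_percent(rows):
    """CPU busy percentage per sample, computed as 100 minus the idle column."""
    return [100 - row["id"] for row in rows]
```

Columns are looked up by header name rather than position, so the parser keeps working if a vmstat version adds or drops a column.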
  • Test Server •  Linux CentOS 6 (kernel 2.6.x) •  FreeSWITCH v1.4 git branch •  Intel Xeon 64-bit processor w/ 8 cores •  Intel SSD •  16GB of RAM
  • Test Lab •  [Diagram: Switchy controls two load generators (FreeSWITCH) over ESL; the load generators send SIP traffic to the test FreeSWITCH server]
  • 2k@50cps simple audio bridge
  • 2k@50cps tone detection
  • 1k@50cps simple audio bridge
  • 1k@50cps session recording
  • 1k@50cps transcoding PCMU/G722
  • 4k@80cps bypass media
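The test labels above are compact, so a small helper can make the implied load explicit. This sketch assumes that "Nk@Mcps" means N thousand concurrent calls set up at M calls per second; that reading is not spelled out in the slides. The names `parse_load` and `seconds_to_reach` are hypothetical. By Little's law (concurrent calls = call rate × hold time), the same division gives both the ramp-up time and the minimum call duration needed to sustain the target.

```python
import re


def parse_load(label):
    """Parse a label such as '2k@50cps simple audio bridge' into (calls, cps)."""
    match = re.match(r"^(\d+)(k?)@(\d+)cps", label.strip())
    if match is None:
        raise ValueError(f"unrecognized load label: {label!r}")
    calls = int(match.group(1)) * (1000 if match.group(2) else 1)
    cps = int(match.group(3))
    return calls, cps


def seconds_to_reach(calls, cps):
    """Seconds of call setup at `cps` needed before `calls` are up at once."""
    return calls / cps
```

Under that reading, the 2k@50cps tests need 40 seconds of ramp-up and calls that last at least that long, while 4k@80cps needs 50 seconds.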
  • Dialplan •  Use bypass media selectively whenever you can •  Avoid transcoding, use late-negotiation and inherit_codec=true •  If you must do transcoding, you can offload to a hardware transcoder
  • Final Thoughts •  You have to measure your own workload •  No easy answers with performance, but you have the tools to find what works for you
  • Last modified: 2018/05/21