multiple audio devices, multiple calls, conferencing, recording and mix all of the above

Zeh Rizzatti
Thu, Jan 26, 2012 4:48 AM

Hello, everyone. I have just joined the list, and this will be my very
first e-mail.

I'm developing a softphone using the pjsua API, and Qt on top of that for
the GUI.

The library is absolutely fantastic, and has made my life easier in so many
ways, but recently, as the feature set has been steadily increasing, I have
some concerns I'd like to raise with you.

I'll try to explain some of the requirements and how I've tackled those so
far, so this could become quite long, but bear with me.

=== Multiple audio devices ===

The application requires that a call can be switched between a main
(usually a headset) device and a secondary one (speakerphone mode). Also,
you are able to set up one of the available devices for ring tones, such as
when a remote call first arrives.

  • First approach: using pjsua_set_snd_dev and keep using
    pjsua_conf_connect/disconnect on port 0
    Worked fine for a while, but I noticed that every time I had to actually
    switch between audio devices, the app would hang for a while until it
    became responsive again.
    Also, there have been complaints about sound quality (especially with the
    mic) from early adopters (could be unrelated, but none on 2nd approach).

  • Second approach: using split/comb and creating the required ports on
    initialization
    Performance/responsiveness improved greatly but I must keep references to
    which of the devices is the one I'm supposed to connect the calls to,
    remove the "ringing" device from the conference bridge prior to connecting
    the "headset" device to the call, and things like that.

=== Multiple calls ===

We support up to 4 calls at the moment. The user can alternate between them,
having a single one active at any time.
Alternating is handled basically by swapping hold/reinvite for the calls.
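
A minimal sketch of that bookkeeping, written so it runs without PJSIP. The
names fake_hold, fake_resume, add_call and switch_to are hypothetical; the two
fake_* helpers only mark the places where a real application would call
pjsua_call_set_hold() and pjsua_call_reinvite().

```c
#include <assert.h>

#define MAX_CALLS 4
#define NO_CALL   (-1)

typedef enum { CALL_FREE, CALL_HELD, CALL_ACTIVE } call_state;

static call_state calls[MAX_CALLS];   /* zero-initialized: all CALL_FREE */
static int active_call = NO_CALL;

/* Hypothetical stand-ins: a real app would call pjsua_call_set_hold()
 * and pjsua_call_reinvite() here. This sketch only updates local state. */
static void fake_hold(int id)   { calls[id] = CALL_HELD; }
static void fake_resume(int id) { calls[id] = CALL_ACTIVE; }

/* Register a new call; it starts out held until the user selects it. */
static void add_call(int id)
{
    assert(id >= 0 && id < MAX_CALLS);
    calls[id] = CALL_HELD;
}

/* Make `id` the single active call. Returns 0 on success, or -1 if `id`
 * is out of range or does not refer to a live call. */
static int switch_to(int id)
{
    if (id < 0 || id >= MAX_CALLS || calls[id] == CALL_FREE)
        return -1;
    if (active_call == id)
        return 0;                     /* already active, nothing to do */
    if (active_call != NO_CALL)
        fake_hold(active_call);       /* hold the previously active call */
    fake_resume(id);                  /* then resume the requested one */
    active_call = id;
    return 0;
}
```

After add_call(0) and add_call(1), calling switch_to(0) and then switch_to(1)
leaves call 0 held and call 1 active, so at most one call is active at a time.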

=== Conference ===

Now, here is where things get quite interesting.

For the sake of simplicity, there can be only ONE conference being held.
Either all (up to 4) calls are in the conference, or we are alternating
between them (in "normal" mode). This way we don't have to worry about
alternating between conference instances, or keeping track of which calls
were in conference with which other calls...
If a new call comes in, or we dial to someone, and we are conferencing at
the moment, the new call is put on the conference.
When we leave conference mode, all calls go back to hold.

What has really intrigued me, and the main reason for this e-mail, is the
number of connections required in the bridge to do such a thing.
Let's say we have all 4 calls, and we enter a conference. We would require:

  • connect 4 call slots to active audio device
  • connect all 4 calls with one another (being careful not to loop any of
    those)
  • and of course, 2 pjsua_conf_connects for bidirectional media for each of
    the above

So, this would mean 2 * (4 * 1 + 3 * 2 * 1) = 20 pjsua_conf_connect
function calls.
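
To make the count concrete, here is a minimal sketch that builds the same
full mesh against a stub instead of the real bridge. fake_conf_connect is a
hypothetical stand-in for pjsua_conf_connect(source, sink), and
build_conference and the slot numbers are assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

#define MAX_SLOTS 8

/* links[a][b] != 0 means slot a is transmitting to slot b. */
static int links[MAX_SLOTS][MAX_SLOTS];
static int connect_count;

/* Hypothetical stand-in for pjsua_conf_connect(source, sink): it records
 * a one-way link and counts it, so the sketch runs without PJSIP. */
static void fake_conf_connect(int source, int sink)
{
    assert(source >= 0 && source < MAX_SLOTS);
    assert(sink >= 0 && sink < MAX_SLOTS);
    if (!links[source][sink]) {
        links[source][sink] = 1;
        connect_count++;
    }
}

/* Connect the audio device and n calls as a full mesh in both directions,
 * never connecting a call to itself. Returns the number of one-way
 * connections that were made. */
static int build_conference(int device_slot, const int *call_slots, int n)
{
    int i, j;

    memset(links, 0, sizeof(links));
    connect_count = 0;
    for (i = 0; i < n; i++) {
        fake_conf_connect(call_slots[i], device_slot);   /* call -> device */
        fake_conf_connect(device_slot, call_slots[i]);   /* device -> call */
        for (j = 0; j < n; j++) {
            if (i != j)                                   /* no self-loop */
                fake_conf_connect(call_slots[i], call_slots[j]);
        }
    }
    return connect_count;
}
```

With the device on slot 0 and calls on slots 1 to 4, build_conference returns
20, matching the figure above. In general n calls need n * (n + 1) one-way
connections under this scheme.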

Is that really the way of it, or would there be something simpler to do?
I mean, it does work this way right now, and it's more of a code (and
mental :) sanity concern, especially when you have to handle things like one
of the calls being hung up, or joining the conference, and
(dis)connecting them all.

=== Recording ===

Last but not least, the feature I implemented today, recording.
Quite simple for the single call scenario, or another 5 potential
connections when a conference is being held.
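
As a rough sketch of where those 5 connections come from, assuming the 4
calls and the audio device each feed the recorder one way.
fake_conf_connect and attach_recorder are hypothetical names; in pjsua the
recorder's slot would be obtained with pjsua_recorder_get_conf_port().

```c
#include <assert.h>

#define MAX_SLOTS 8

/* links[a][b] != 0 means slot a is transmitting to slot b. */
static int links[MAX_SLOTS][MAX_SLOTS];

/* Hypothetical stand-in for pjsua_conf_connect(source, sink). */
static void fake_conf_connect(int source, int sink)
{
    links[source][sink] = 1;
}

/* Feed every source slot into the recorder slot, one way only.
 * Returns how many new connections were made. */
static int attach_recorder(int recorder_slot, const int *source_slots, int n)
{
    int i, made = 0;

    assert(recorder_slot >= 0 && recorder_slot < MAX_SLOTS);
    for (i = 0; i < n; i++) {
        assert(source_slots[i] >= 0 && source_slots[i] < MAX_SLOTS);
        if (!links[source_slots[i]][recorder_slot]) {
            fake_conf_connect(source_slots[i], recorder_slot);
            made++;
        }
    }
    return made;
}
```

With the device on slot 0, calls on slots 1 to 4 and the recorder on slot 5,
attach_recorder makes 5 connections the first time and none if repeated.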

This got me thinking... Couldn't there be some sort of component that could
act as the main "media source"?
I could connect the recorder to this slot, and anything that was connected
to it would stream directly to the recorder?
Could this also be the case with a call that is plugged to this slot, not
requiring that all calls connect to one another?

Could this be accomplished using a split/comb? Or is there another media
port type that would be better?

If you read all of that, thank you for your patience.
If you have any thoughts and reply, know that you have a place to stay
during Brazilian Carnival =)

Thanks in advance,

Zeh
