[USRP-users] In search of 200 MSA/sec (Windows 7)

Neel Pandeya neel.pandeya at ettus.com
Mon Mar 14 17:20:23 EDT 2016


Hello Tilla:

You did just about everything that I might suggest, and your system
certainly sounds powerful enough to handle 200 Msps. The Intel X710 is the
latest 10 GbE card from Intel, and should perform well and be able to
handle this data rate. There are only a few other things that I might
mention. Have you installed the latest Intel X710 driver? Would you be able
to upgrade Windows? Several customers have reported that they were able to
achieve improved performance and throughput by upgrading from Windows 7 to
Windows 8 or 10. I'm sure you're already using SSD disks, and I'm not sure
if you're being limited by disk I/O, but perhaps you could try a RAID setup
or a RAM disk? Have you turned off all power management, and disabled ACPI in
the BIOS? And speaking of the BIOS, some customers have also reported that
a BIOS upgrade improved throughput, although your system looks like it's
new, so the BIOS firmware version should be very recent. I agree with
Marcus that the FastSendDatagramThreshold registry setting might be
helpful. I'm not sure if it behaves differently between Windows 7, 8, 10,
and Server 2012. It's the only registry setting tweak that I'm aware of.
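
For reference, that setting lives under the AFD service parameters, and a
reboot is typically needed for it to take effect. A minimal .reg sketch is
below; the value is only an assumption that it should comfortably exceed
your 8000-byte datagrams, so use the recommended value from the .reg file
that Marcus linked:

    Windows Registry Editor Version 5.00

    [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\AFD\Parameters]
    "FastSendDatagramThreshold"=dword:00008000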

Please let us know if you're able to make any further progress, and whether
you see improved results with Windows Server 2012.

--Neel



On 12 March 2016 at 10:43, Marcus Müller <usrp-users at lists.ettus.com> wrote:

> Hi Tilla,
>
> Agreeing, this needs further investigation.
>
> > I have done all the basic NIC tuning that is frequently discussed here:
> > jumbo packets, disable interrupt coalescing, increase buffer sizes...
> >
> > I have done a huge amount of other tuning: disable NUMA, PCIe performance
> > mode, process affinitized to the same CPU the NIC is directly connected to,
> > hyperthreading disabled, hand-optimized compilation, and a boatload of
> > other stuff.
>
> In my (Linux) experience, disabling the OS's automatic mechanisms is most
> of the time not beneficial to performance; in fact, I remember a case where
> CPU affinity settings made it hard for the kernel to optimally schedule
> kernel drivers and userland, so that a significant increase in performance
> was observed after we stopped setting affinity. Of course, trying to set
> affinity still is a very good idea – after all, you have knowledge of the
> application that your OS lacks. I'd definitely leave hyperthreading on;
> it's very rarely the reason for slowdown in programs that seem to be
> IO-bound. You should definitely **enable** interrupt coalescing; there is
> no way a system is going to keep up if every single packet of a 200 MS/s
> transmission causes a hardware interrupt (at 4 bytes per sc16 sample that
> is 800 MB/s; with 8000 B per packet, that'd be 100,000 interrupts per
> second...).
>
> The one setting that I don't find in your list is the
> FastSendDatagramThreshold setting [1]; did you do that?
>
> What I take away from your execution-time percentages is that most of the
> time is spent in the symbol that actually handles the managed send buffers
> (see the boost::function1<> line, 38%); looking more closely into that
> would be interesting; can you do that?
> Also, agreeing, there's not that much one can do about the converter;
> don't be fooled by the fact that it also does byte conversions; it moves
> the data out of the network card buffer into the recv()-buffer, and usually
> you hit a memory bandwidth wall there, if you optimize the numerical
> operation enough.
>
>
> Best regards,
> Marcus
>
> [1]
> http://files.ettus.com/manual/page_transport.html#transport_udp_windows ;
> https://raw.githubusercontent.com/EttusResearch/uhd/master/host/utils/FastSendDatagramThreshold.reg
>
>
> On 09.03.2016 17:00, tilla--- via USRP-users wrote:
>
>
> Over the past 2 weeks, I have been working towards achieving 200 MSA/sec
> on Windows 7 64-bit, UHD 3.9.2, X310 w/WBX-120 daughterboard.
>
> I have a pretty good processor, a 3.5 GHz Xeon E5-2637 v2, plenty of
> memory, and an X710-DA2 NIC in an x8 Gen 3 slot with verified Gen 3 trained
> link speed; a pretty beefy box in general.
>
> I have gathered a bunch of data and was looking for some further thoughts.
>
> I have done all the basic NIC tuning that is frequently discussed here:
> jumbo packets, disable interrupt coalescing, increase buffer sizes...
>
> I have done a huge amount of other tuning: disable NUMA, PCIe performance
> mode, process affinitized to the same CPU the NIC is directly connected to,
> hyperthreading disabled, hand-optimized compilation, and a boatload of
> other stuff.
>
> This is a simple prototype of a transmit-only application: 1 thread, 1
> transmit buffer, just send the same buffer as fast as possible.  The
> transmit loop is as simple as possible (bottom of page 2 in attachment).
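>
> Roughly, it is the textbook UHD send() pattern; a minimal sketch (the
> device address and buffer handling below are placeholders, not the exact
> code from the attachment):
>
>     #include <uhd/usrp/multi_usrp.hpp>
>     #include <complex>
>     #include <string>
>     #include <vector>
>
>     int main()
>     {
>         // Hypothetical device address; substitute your X310's address.
>         auto usrp = uhd::usrp::multi_usrp::make(std::string("addr=192.168.40.2"));
>         usrp->set_tx_rate(200e6); // target host-side sample rate
>
>         // sc16 on the host and sc16 on the wire (the converter then mostly
>         // just byteswaps between host and network byte order).
>         uhd::stream_args_t stream_args("sc16", "sc16");
>         uhd::tx_streamer::sptr tx_stream = usrp->get_tx_stream(stream_args);
>
>         // One buffer, reused for every send() call.
>         std::vector<std::complex<short>> buff(tx_stream->get_max_num_samps());
>
>         uhd::tx_metadata_t md;
>         md.start_of_burst = true;
>
>         while (true) {
>             tx_stream->send(&buff.front(), buff.size(), md);
>             md.start_of_burst = false; // only the first packet starts the burst
>         }
>         return 0;
>     }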
>
> Attached is some performance data.
>
> 50 MSA/sec, very good performance, occasional underflow, max CPU on a core
> ~35% (Figure 1 screenshots).
>
> 100 MSA/sec, decent performance, more underflows but still "reasonable",
> max CPU on a core ~70% (Figure 2 screenshots).
>
> Handwavy observation based upon the above numbers: performance scales
> linearly; when the sampling rate doubles, max CPU utilization doubles,
> exactly as expected...
>
> Soooooo, now when going to 200 MSA/sec, constant underflows, not much
> transmission at all.
>
> Extrapolating based upon the 100 MSA/sec numbers, I would need ~140% of a
> CPU core :(  (cue Charlie Brown music)
>
> If it is the byteswapping that is the true bottleneck, I am not sure there
> is really anything I can do, as it is already SSE.  Unless I do something
> like AVX or AVX2...
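>
> For reference, a 16-bit byteswap in AVX2 would look roughly like the
> sketch below (just the raw swap on its own, not UHD's actual converter
> code; compile with AVX2 enabled, e.g. -mavx2 or /arch:AVX2):
>
>     #include <immintrin.h>
>     #include <cstddef>
>     #include <cstdint>
>
>     // Swap the two bytes of every 16-bit word; assumes n is a multiple of 16.
>     static void byteswap16_avx2(const std::int16_t* in, std::int16_t* out,
>                                 std::size_t n)
>     {
>         // Per-128-bit-lane shuffle: output byte i comes from input byte i^1.
>         const __m256i mask = _mm256_set_epi8(
>             14, 15, 12, 13, 10, 11, 8, 9, 6, 7, 4, 5, 2, 3, 0, 1,
>             14, 15, 12, 13, 10, 11, 8, 9, 6, 7, 4, 5, 2, 3, 0, 1);
>
>         for (std::size_t i = 0; i < n; i += 16) {  // 16 int16_t per register
>             __m256i v = _mm256_loadu_si256(
>                 reinterpret_cast<const __m256i*>(in + i));
>             _mm256_storeu_si256(reinterpret_cast<__m256i*>(out + i),
>                                 _mm256_shuffle_epi8(v, mask));
>         }
>     }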
>
> On the recent thread titled "Throughput of PCIe Interface" we had some
> performance discussions.  I don't have the email available right now, but
> someone claimed to have 200 MSA/sec working.  I guess I am curious as to
> the setup.
>
> I would not think that byteswapping performance would vary much across
> operating systems...
>
> So I guess I am looking for any suggestions or thoughts anyone might have
> related to getting to 200 MSA/sec, short of changing operating systems (for
> now).
>
> I am planning to evaluate Windows Server 2012 Standard in the near future
> if I cannot get this working in some form, but would like to exhaust all
> options before investing that much time.
>
> Sorry for the long-winded email.
>
> Thanks,
>
>
>
> _______________________________________________
> USRP-users mailing list
> USRP-users at lists.ettus.com
> http://lists.ettus.com/mailman/listinfo/usrp-users_lists.ettus.com
>
>