[USRP-users] rx_samples_to_file issue

Peter Witkowski pwitkowski at gmail.com
Fri Oct 3 11:44:34 EDT 2014

So I'm confused.

You state that if I can't use rx_samples_to_file, my system is failing to
perform as specified to write data out, then you give an example of several
things that can happen to create a stochastic write speed (which I totally
understand and agree with).  Given that writes can be stochastic, why is
there not a software buffer implemented in the UHD sample code to account
for such issues?  I understand that it's meant to be an example, but I've
also seen it referenced as being used effectively as a debugger or test for
people having issues (i.e. recommendation to use the UHD programs in place
of GNURadio to resolve issues).

Also, in terms of benchmarking, I'm quoting minimum values, not averages.
I agree with you that average values are pointless, and in reality the disk
subsystem needs to perform when called upon.  My minimum values for a 4 disk
RAID0 with a dedicated controller are well within the data rate that I am
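
To make the minimum-versus-average distinction concrete, here is a minimal sketch of a per-chunk write benchmark. It is not part of UHD or of any benchmark tool mentioned in this thread; the function names, chunk size and chunk count are hypothetical choices for illustration. It times each synced chunk separately, so the slowest chunk (the number a real-time stream actually cares about) is visible next to the mean.

```python
import os
import tempfile
import time


def measure_write_rates(path, chunk_bytes=1 << 20, n_chunks=32):
    """Write n_chunks blocks to path and return per-chunk rates in bytes/s."""
    block = b"\0" * chunk_bytes
    rates = []
    with open(path, "wb") as f:
        for _ in range(n_chunks):
            start = time.perf_counter()
            f.write(block)
            f.flush()
            os.fsync(f.fileno())  # push the chunk to the device before the timer stops
            elapsed = time.perf_counter() - start
            rates.append(chunk_bytes / max(elapsed, 1e-9))
    return rates


def summarize(rates):
    """Return (worst, mean) of the per-chunk rates."""
    return min(rates), sum(rates) / len(rates)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        rates = measure_write_rates(os.path.join(tmp, "bench.bin"))
    worst, mean = summarize(rates)
    print(f"worst chunk: {worst / 1e6:.1f} MB/s, mean of chunks: {mean / 1e6:.1f} MB/s")
```

A large gap between the two numbers would suggest occasional stalls that an averaged benchmark hides.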

Is there an example system that can handle sustained data capture from the
USRP at (or near the limits) of 10GigE or the PCIe interfaces (maybe the
requirement is enterprise class PCIe SSDs)?  I'm running a two socket Xeon
system (two hex core processors) with 64GB of RAM.  How much more hardware
should I throw at the problem to be able to sample/write at 100 MS/s (half of
what is quoted on the website for bandwidth for the 10GigE kit) using the
provided code?
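
As a rough illustration of the arithmetic behind these rates, the sketch below converts a sample rate into a sustained byte rate. It assumes 4 bytes per complex sample (16-bit I plus 16-bit Q), which is consistent with the 400MB/s and 20.8Mbytes/second figures quoted elsewhere in this thread, and 8 bytes per sample for complex-float. The function and constant names are made up for the example.

```python
SC16_BYTES = 4  # assumed: complex 16-bit, 2 bytes I + 2 bytes Q
FC32_BYTES = 8  # assumed: complex float, 4 bytes I + 4 bytes Q


def stream_rate_bytes_per_s(sample_rate_sps, bytes_per_sample=SC16_BYTES):
    """Bytes per second that must be written to keep up with one stream."""
    return sample_rate_sps * bytes_per_sample


if __name__ == "__main__":
    for label, rate in (("100 MS/s", 100_000_000), ("200 MS/s", 200_000_000)):
        mb = stream_rate_bytes_per_s(rate) / 1e6
        print(f"{label} as 16-bit complex: {mb:.0f} MB/s")
```

Under that assumption, 100 MS/s needs 400 MB/s of sustained writes and 200 MS/s needs 800 MB/s, and storing complex-float doubles both.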

I think the issue here is that the code itself can't simply get through
its main loop fast enough.  There's a difference between data bandwidth
and CPU throughput.  The sequential nature of the code means that if any
weird stuff happens (your example was a good set of kernel related hilarity
that can lead to stochastic timing) you will have overflows since you
cannot read fast enough.  This is why a 90% solution for my application was
to just set the dirty_background_ratio to 0 and also why redirection to
/dev/null makes overflows go away.  With either method I didn't have to
wait for a large write cache to flush before moving on to the next read
from the USRP.  Note that there can also be things that happen on the read
side as well.  Does this mean that I can only run the code on an RTOS?
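
The decoupling described above can be sketched as a reader thread feeding a bounded queue that a writer thread drains. This is only an illustration of the pattern, not the UHD example code: `read_block` is a hypothetical stand-in for whatever call receives samples from the device, `sink` is any object with a `write` method, and the queue depth is an arbitrary assumption.

```python
import io
import queue
import threading


def capture(read_block, sink, n_blocks, max_queued=64):
    """Read n_blocks via read_block() while a second thread writes them to sink.

    Returns the number of blocks dropped because the queue was full, which
    plays the role of an overflow in this sketch.
    """
    buf = queue.Queue(maxsize=max_queued)
    done = object()  # sentinel telling the writer thread to stop

    def writer():
        while True:
            block = buf.get()
            if block is done:
                return
            sink.write(block)

    t = threading.Thread(target=writer)
    t.start()
    dropped = 0
    for _ in range(n_blocks):
        block = read_block()
        try:
            buf.put_nowait(block)  # never stall the reader on a slow write
        except queue.Full:
            dropped += 1
    buf.put(done)  # a blocking put is fine here: the capture is over
    t.join()
    return dropped


if __name__ == "__main__":
    fake_device = lambda: bytes(4096)  # hypothetical stand-in for a receive call
    out = io.BytesIO()
    lost = capture(fake_device, out, n_blocks=100, max_queued=128)
    print(f"wrote {out.getbuffer().nbytes} bytes, dropped {lost} blocks")
```

With this structure a slow write only delays the writer thread; the reader keeps pulling data until the queue itself fills, so the queue depth sets how long a stall the capture can ride out.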

As a final note, my understanding is that GNURadio and the USRP were
developed for domain experts in DSP to use.  These users may or may not
have prior experience in software.  As a result, I'd recommend perhaps
adding a buffered example or having the USRP GNURadio block allow for
buffering (maybe even a simple "buffer" block in GNURadio would do the
trick?).  Otherwise, I just don't see how you can advertise 200 MS/s.  I
understand that this is the theoretical limit of the bus, but if there
doesn't exist a driver or other software to make use of this, the practical
limit becomes much, much smaller.

On Fri, Oct 3, 2014 at 10:55 AM, Marcus Müller <usrp-users at lists.ettus.com>
wrote:

>  I have to agree with Marcus on this. Also, keep in mind that storage is
> really what an operating system should take care of in any "general
> purpose" scenario, i.e. as long as I just write to a file, I'd expect
> that the thing in charge of storage (my kernel / the filesystems / block
> device drivers) does the best it can to keep up. If I find myself in a
> situation where my specific storage needs dictate a huge write buffer,
> changing the application might be one way, but as I'm responsible for my
> own storage subsystem, I could just as well increase the cache buffer
> sizes, and let the operating system handle storage operation. If your RAID
> is really performing as well as it is benchmarked to, then this should not
> be one of your problems. All rx_samples_to_file really does is sequentially
> write out data at a constant rate, which is the most basic write
> benchmark I can think of.
> If your storage subsystem (filesystem + storage abstraction + raid driver
> + interface driver + hard drive interface + hard drives + hardware caches)
> can't keep up, it's failing to perform as specified, simple as that. In
> this case, saying that the application needs to be smarter when dealing
> with storage seems like a bit of a cop-out to me ;)
> I'd like to point out that most benchmarks use heavily averaged numbers
> for write speeds etc. UHD on the other hand kind of demands soft real-time
> performance of a write subsystem, which is a lot harder to fulfill. This
> comes up rather frequently, but I have to stress it: you need a fast
> guaranteed write rate, not only an average one, and as soon as your
> operating system has to postpone writing data[1], it has to have enough
> performance to catch up whilst still meeting continued demand. This is
> general purpose hardware running general purpose OS with dozens of
> processes, and you can't just say "every single component is up to my task,
> thus my system suffices", because everything potentially blocks everything!
> Greetings,
> Marcus
>
> [1] e.g. because the filesystem needs to calculate checksums, update
> tables, another process gets scheduled, a device blocks your PCIe bus, your
> platters randomly need a bit longer to seek, you reach the physical end of
> an LVM volume and have to move across a disk, an interrupt does what an
> interrupt does, some process is getting notified on a changing file
> descriptor, DBUS is happening in the kernel, token ring has run out of
> tokens, thermal throttling, bitflips on SATA leading to retransmission,
> some page getting fetched from swap...
>
> On 03.10.2014 15:34, Marcus D. Leech via USRP-users wrote:
> One has to keep firmly in mind that programs like rx_samples_to_file are
> *examples* that show how to use the underlying UHD API. They are not
> necessarily optimized for all situations, and indeed, one could
> restructure rx_samples_to_file to decouple UHD I/O from filesystem I/O,
> using a large buffer between them.
>
> The fact is that dynamic performance of high-speed, real-time flows is
> something that almost invariably needs tweaking for any particular
> situation. There's no way for an example application to meet all those
> requirements.
>
> But the fact also remains that for *some* systems, rx_samples_to_file
> (and uhd_rx_cfile on the Gnu Radio side) are able to stream high-speed
> data just fine as-is.
>
> On 2014-10-03 09:26, Peter Witkowski via USRP-users wrote:
>  To say that the issue is just because the disk subsystem can't keep up is a bit of a cop-out.
> I had issues writing to disk when the incoming stream was 400MB/s and my RAID0 system was benchmarked at being much higher than that.
> The issue that I've been seeing stems from the fact that it appears that you cannot concurrently read/write from the data stream as it's coming in. In effect you have a main loop that reads from the device and then immediately tries to write that buffer to file. If you do not complete these operations in a timely fashion, overflows occur.
> One way to solve (or at least band-aid) the issue is to set your dirty_background_ratio to 0. I was able to get writing to disk working somewhat with this setting, as it is more predictable to write directly to disk instead of having your write cache fill up and then having a large amount of data to push to disk. That said, my RAID0 array is capable of such speeds and even then I was getting a few (but much reduced) overflows.
> The one surefire way I know of getting this working (even on a slow disk system) is to buffer the data. The buffer can then be consumed by the disk writing process while being concurrently added onto by the device reader. The easiest way to test buffering (that I've found) is to simply set up a GNURadio Companion program with a stream-to-vector block between the USRP and file sink blocks. This is exactly what I am doing currently since even with a very powerful system, I could not get data saved to disk quickly enough given the aforementioned issues with the provided UHD software.
> On Thu, Oct 2, 2014 at 11:48 PM, gsmandvoip via USRP-users <usrp-users at lists.ettus.com> wrote:
> Thanks Marcus for your replies. Yes, the 'O's have gone away.
> On Thu, Oct 2, 2014 at 5:50 PM, Marcus D. Leech <mleech at ripnet.com> wrote:
> With rx_samples_to_file without _4rx.rbf: initially I tried on my i3 with 4GB RAM. It gave me some OOOO, but fewer than earlier. I do not understand: most of my RAM capacity and processor were sitting idle while it showed OOOO. Why this strange behaviour?
> The default format for uhd_rx_cfile is complex-float, thus doubling the amount of data written compared to rx_samples_to_file.
> You can't just use CPU usage as an indicator of loading--if you're writing to disk, the disk subsystem may be much slower than you think, so the
> "rate limiting step" is writes to the disk, not computational elements.
> Try using /dev/null as the file that you write to. If the 'O' go away, even at higher sampling rates, then it's your disk subsystem.
> Using uhd_rx_cfile I get a similar result, but strangely, why is it low? At 4M sampling rate it was higher.
> On Thu, Oct 2, 2014 at 9:27 AM, Marcus D. Leech <mleech at ripnet.com> <mleech at ripnet.com> wrote:
> On 10/01/2014 11:46 PM, gsmandvoip wrote:
> Yes, I am running a single channel, but when I try to achieve my desired sampling rate without _4rx.rbf, it says the requested sampling rate is not valid and adjusts it to some 3.9M or so. Sorry for the misleading info I gave earlier: I have an i3 with 32 bit and an i7 with 64 bit, but I get the same result on both machines.
> Here is my command to capture signal:
> ./rx_samples_to_file --args="fpga=usrp1_fpga_4rx.rbf, subdev=DBSRX" --freq "$FC" --rate="$SR" $FILE --nsamps "$NSAMPLES"
> and here is its output:
> Creating the usrp device with: fpga=usrp1_fpga_4rx.rbf, subdev=DBSRX...
> -- Loading firmware image: /usr/share/uhd/images/usrp1_fw.ihx... done
> -- Opening a USRP1 device...
> -- Loading FPGA image: /usr/share/uhd/images/usrp1_fpga_4rx.rbf... done
> -- Using FPGA clock rate of 52.000000MHz...
> The user specified 1 channels, but there are only 0 tx dsps on mboard 0.
> Don't use the _4rx image if you don't need it.
> The USRP1 only does strict-integer resampling, and with a master clock (NON STANDARD FOR USRP1) of 52.000MHz, 4Msps is not a sample rate
> that it can produce. Try 5.2Msps or 4.3333Msps.
> At 5.2Msps, it's recording at roughly 20.8Mbytes/second, so your system needs to be able to sustain that for at least as long as the capture lasts.
> _______________________________________________
> USRP-users mailing list
> USRP-users at lists.ettus.com
> http://lists.ettus.com/mailman/listinfo/usrp-users_lists.ettus.com

Peter Witkowski
pwitkowski at gmail.com