[USRP-users] Hardware selection for X310-based system

Marcus Müller marcus.mueller at ettus.com
Fri Jul 24 12:49:22 EDT 2015


Vladimir,

Quite the same for the RX. Again, as soon as you start implementing your own stuff on the FPGA, there's basically no limit on how you distribute usage of the DDR RAM. There is no difference between X310 and X300 in this respect.
From a support engineer's perspective, I'd suggest not changing the RX buffering until you're sure you have to: no matter how large you make it, if you need to receive continuously, your computer needs to be fast enough anyway, so even the largest buffer would not save you from missing signal if your PC is not up to the task. Also, it's relatively easy to implement layers of buffering on general PC hardware, so if for some reason you have a hiccup in your signal processing, you'd just use another buffer. Modern CPUs typically have cores waiting for IO, so a thread that continuously shuffles data from the USRP into an intermediate pool of buffers is pretty much free from a computing point of view, and thanks to the thread safety of the recv() function of the UHD API, that's the typical receive application's architecture anyway. Things scale pretty well on multicore CPUs with UHD.
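As an illustration of that architecture, here's a minimal, self-contained producer/consumer sketch in plain Python -- the recv_fn callable is a stand-in for the UHD rx_streamer's recv(), not the actual API:

```python
import threading
import queue

def make_receiver(recv_fn, pool: "queue.Queue", max_bufs: int):
    """Continuously pull sample buffers from recv_fn into a pool.

    recv_fn stands in for uhd's rx_streamer.recv(); here it is any
    callable returning one buffer of samples (or None to stop).
    """
    def run():
        for _ in range(max_bufs):
            buf = recv_fn()
            if buf is None:
                break
            pool.put(buf)          # hand off to the processing thread
        pool.put(None)             # sentinel: no more data
    return threading.Thread(target=run)

# Simulated "USRP" source: yields 4 dummy buffers of samples.
samples = iter([[0.0] * 4, [0.1] * 4, [0.2] * 4, [0.3] * 4])
pool = queue.Queue(maxsize=16)     # the intermediate layer of buffering
rx_thread = make_receiver(lambda: next(samples, None), pool, max_bufs=10)
rx_thread.start()

# Consumer: signal processing drains the pool at its own pace, so a
# short hiccup here only grows the pool instead of dropping samples.
received = []
while (buf := pool.get()) is not None:
    received.append(buf)
rx_thread.join()
print(len(received))  # 4 buffers made it through the intermediate pool
```

The bounded queue is the "intermediate pool of buffers": the receive thread stays trivially cheap, and queue.Queue gives the same kind of thread safety the text ascribes to recv().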

Best regards,
Marcus

Am 24. Juli 2015 18:29:27 MESZ, schrieb Vladimir via USRP-users <usrp-users at lists.ettus.com>:
> Ian,
>
>Thank you for the explanation! And what about Rx - is it also buffered
>similarly? We are mainly concerned about Rx buffering, since it
>can probably compensate for some hitches in the ethernet/PCIe interface.
>The question is: does the X310 have more memory that we could use to
>extend the Rx buffer? How large is this difference compared to the X300?
>We have to select one of them and need to estimate how big the
>difference is from this point of view.
>
>Vladimir
>
>
>>Friday, 24 July 2015, 8:54 -07:00, from Ian Buckley
><ianb at ionconcepts.com>:
>>
>>Vladimir, 
>>both X300 and X310 using the factory FPGA images offer approximately
>520K Bytes of SRAM FIFO that buffers Tx data before transmission. This
>buffer decouples the host and transport with it's inherent scheduling
>latency and jitter from the hard real-time nature of the DAC's
>consuming data. That amount is empirically enough to run practical
>applications at the highest sampling rates possible on the X300,
>clearly at much lower sampling rates it's overkill. This is actually an
>easily altered parameter during the build process, but typical users
>would not alter it, there is very little motivation. Very few
>applications can make practical use of over large FIFO's for Tx because
>typical radio's do not know in advance when and what they will be
>transmitting.
>>We have a build option for the FPGA that uses DRAM instead of SRAM to
>build the large buffers, but it's not currently able to reliably sustain
>throughput in the corner cases that can occur at the highest MIMO
>sample rates, and is thus still the subject of development effort. The
>DRAM is far larger than any practical Tx FIFOs would need to be, and it
>is included for the flexibility of FPGA developers, who may use it for
>all sorts of things. Expect to see some RFNoC blocks using it, for
>example.
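As a rough back-of-the-envelope check of what ~520 KB of Tx FIFO buys you (assuming sc16 samples, i.e. 4 bytes per complex sample -- an assumption; the wire format can differ):

```python
# How long does the ~520 KB Tx SRAM FIFO decouple the host from the DACs?
# Assumes 16-bit I + 16-bit Q per sample (sc16), i.e. 4 bytes/sample.
FIFO_BYTES = 520 * 1024
BYTES_PER_SAMPLE = 4

for rate_msps in (200, 100, 25, 1):
    samples = FIFO_BYTES // BYTES_PER_SAMPLE
    buffered_ms = samples / (rate_msps * 1e6) * 1e3
    print(f"{rate_msps:>4} MS/s -> {buffered_ms:.3f} ms of Tx buffering")
```

So at the full 200 MS/s the FIFO rides out well under a millisecond of host jitter, while at low rates it covers far more -- which matches the "empirically enough / overkill" characterization above.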
>>
>>-Ian
>>
>>On Jul 24, 2015, at 7:05 AM, Vladimir via USRP-users <
>usrp-users at lists.ettus.com > wrote:
>>>Marcus,
>>>
>>>To clarify the purpose of this question: if we get the X310, will we have
>the ability (maybe by modifying firmware) to have a larger transfer buffer,
>and how much can it increase in size compared to the X300?
>>>
>>>I didn't realize that the X3X0 has 1 GB DRAM (shown on the diagram in
>http://www.ettus.com/content/files/X300_X310_Spec_Sheet.pdf ). Is this
>the DRAM that you mentioned? Can you please summarize all the memory in
>both units?
>>>
>>>E.g., is this correct: both X300 & X310 have 1 GB DRAM, and the SRAM size
>differs according to the FPGA model (16,020 Kb and 28,620 Kb as specified
>here: http://www.ettus.com/product/details/X310-KIT )?
>>>
>>>Theoretically, if we alter the FPGA firmware, can we increase the transfer
>buffers, or are their sizes hardware-defined? I didn't find the info
>about those two firmware versions you spoke about - where can I read about
>this, and what is the difference between them?
>>>
>>>Thanks!
>>>Vladimir
>>>
>>>>Friday, 24 July 2015, 13:37 +02:00, from Marcus Müller via USRP-users
>< usrp-users at lists.ettus.com >:
>>>>
>>>>Hi Vladimir,
>>>>
>>>>no, from the computer's point of view, these devices are really
>    identical.
>>>>
>>>>There is some buffering on the X310 and X300, and as far as I know
>   they are absolutely identical (I'd have to read the FPGA source code
>    [1]), yes, but it's not large enough to buffer seconds of sample
>    data -- that would inherently lead to a lot of latency, and in
>   almost all applications, that would be unwanted. We actually do have
>    two different stock images for the FPGA, one using only SRAM (HGS),
>    and one using DRAM (XGS); the first is the default, and most
>    customers use it.
>>>>
>>>>Regarding accurate clocking: Both PCIe and 10GE are
>    packet/burst-based, so there's some kind of inevitable buffering;
>    however, the USRP keeps its own reference-oscillator-disciplined
>    time, and gives you timestamps with the RX samples, as well as
>   allowing you to specify the exact time when a sample should be sent,
>    or a command (such as tuning, setting gain) should be executed.
>    That's why in many cases, the interface latency doesn't matter at
>    all -- you just make sure that the first bunch of samples of your
>    transmission burst is sent "early enough" to the USRP, and add a
>    timespec telling the USRP to hold these samples.
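The "early enough" budgeting can be sketched as simple arithmetic (plain Python; the latency and margin numbers are illustrative assumptions, and burst_time_spec is a hypothetical helper, not a UHD call):

```python
# Decide the timestamp for a timed Tx burst: the samples must arrive at
# the USRP before their scheduled transmit time, so budget for the
# worst-case host/transport latency plus a safety margin.
def burst_time_spec(device_time_s: float,
                    worst_case_link_latency_s: float,
                    safety_margin_s: float) -> float:
    """Earliest safe timestamp (seconds) to schedule the burst at."""
    return device_time_s + worst_case_link_latency_s + safety_margin_s

# Example: device time read as 10.0 s; assume <= 2 ms transport latency
# and a 5 ms margin; schedule the burst no earlier than this.
t = burst_time_spec(10.0, 2e-3, 5e-3)
print(f"{t:.3f}")  # prints 10.007
```

Since the device holds the samples until their timestamp, the link latency only needs to stay below the budget -- it does not need to be small or constant.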
>>>>
>>>>Setting the device time can be done e.g. by a GPSDO, or via the
>    interface on the edge of a PPS (pulse-per-second) signal (doesn't
>    really need to be a pulse per second if you really just want to set
>    the device time once -- a pull-up switch does work ;) ), or without
>    external time reference by just instructing the X3x0 to set a new
>    time in its timing registers.
>>>>
>>>>All this is done via standard UHD API calls.
>>>>
>>>>Best regards,
>>>>Marcus
>>>>
>>>>[1] 
>https://github.com/EttusResearch/fpga/tree/master/usrp3/top/x300
>>>>On 24.07.2015 13:19, Vladimir via
>      USRP-users wrote:
>>>>>Marcus,
>>>>>
>>>>>Yes, I understand this, but does it affect the connectivity with the
>      host, or is it used only for signal processing algorithms? Maybe
>      this is another question, but does the X300 do any IO buffering at
>      all, to provide accurate clocking of the samples when interfacing
>      e.g. with ethernet? Is there some buffer there, and how big is it?
>>>>>
>>>>>Thanks,
>>>>>Vladimir
>>>>>
>>>>>>Friday, 24 July 2015, 13:09 +02:00, from
>        Marcus Müller via USRP-users <usrp-users at lists.ettus.com> :
>>>>>>
>>>>>>The difference
>               is that the X310 has the bigger FPGA, which only makes a
>                difference if you want to do custom processing on it --
>                for example, building a largish RFNoC application.
>                Otherwise, the X300 and X310 are absolutely identical.
>>>>>>
>>>>>>Best regards,
>>>>>>Marcus
>>>>>>
>>>>>>On 24.07.2015 13:05, Vladimir via USRP-users wrote:
>>>>>>>Hi guys,
>>>>>>>
>>>>>>>I wish to thank all of you for the valuable information,
>                  it's very helpful for us! Looking at all this, it seems
>                  that we could start with the 10 GbE link and then see if
>                  we need lower latency or anything else before trying PCIe.
>>>>>>>
>>>>>>>I forgot one question yesterday. From what
>                 has been discussed, how big is the difference between the
>                 X300 and X310? One of the differences is that the X310 has
>                 more memory - ~3.5 MB if I'm not mistaken, compared to 2 MB
>                  in the X300. Does it use this memory to do some sort of
>                 buffering when transferring data to/from the host? Will it
>                 help to withstand any short hitches/delays that happen
>                  during streaming? For now, it's the only reason I
>                  could imagine to prefer the X310 from this point of view.
>>>>>>>
>>>>>>>Again, thank you very much for the info!
>>>>>>>Vladimir
>>>>>>>
>>>>>>>>Friday,
>                    24 July 2015, 1:22 +02:00, from Marcus Müller via
>                    USRP-users <usrp-users at lists.ettus.com> :
>>>>>>>>
>>>>>>>>Hi Vladimir,
>>>>>>>>
>>>>>>>>although he claimed not to answer your
>                            question, I think Rob did an excellent job
>                            of providing the info you've asked for;
>                            thank you!
>>>>>>>>
>>>>>>>>Hence, I'll keep this as short as possible
>>>>>>>>On 23.07.2015 22:31, Vladimir via USRP-users
>                            wrote:
>>>>>>>>>1. We already have an Intel X520-DA2 board which Ettus
>recommends for 10GbE input. Would it suffice to order just 783343-01
>(10 Gb SFP+ ETHERNET CABLE, 1M) with the X310 kit to get a complete setup
>able to stream data to/from the PC? That, or any direct attach SFP+ 10GE
>cable,
>                            or a pair of transceivers for whatever
>                            electrical or optical medium you plan to
>                            use.
>>>>>>>>>2. From what I read earlier here and saw on the Ettus website, I
>understand that the PCI-Express Connectivity Kit (PCIe – Desktop) is the NI
>PCIe-8371 board with the cord. NI states its speed to be 798 MB/s. But
>to stream 100 MS/s we need 800 MB/s. Is their number - 798 -
>approximate (which would look kind of strange to me given its
>precision :)) and this won't in practice lead to speed problems, or do
>they actually mean 798 MiB/s, or is there some other explanation? Could
>anyone shed some light on this? I must admit that this is news to me. But
>                            yes, PCIe is not the bus of choice for
>                            maximum throughput. In fact, most PCIe
>                            10Gigabit network adapters, and especially
>                            the Intel devices, have drivers that are
>                            very good at distributing work across
>                            multiple CPU cores and reducing interrupt
>                            pressure -- it's in fact progressively harder
>                            for some customers to get high rates
>                            (>ca. 150MS/s total) with the PCIe
>                            interface, mainly because the NI driver, for
>                            compatibility reasons, is single-threaded,
>                            and a single core can only spin so fast.
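For reference, the arithmetic behind the 800 MB/s figure (assuming two channels of sc16, i.e. 4 bytes per complex sample):

```python
# Host-link throughput needed to stream sc16 samples (4 bytes/sample).
def link_rate_mbytes(rate_msps: float, channels: int,
                     bytes_per_sample: int = 4) -> float:
    """Required link rate in MB/s for the given sample rate and channels."""
    return rate_msps * channels * bytes_per_sample

print(link_rate_mbytes(100, 2))   # 800.0 MB/s for 2x 100 MS/s
print(link_rate_mbytes(100, 1))   # 400.0 MB/s for a single channel
```

So 800 MB/s corresponds to two full-rate channels; a single 100 MS/s channel fits comfortably under NI's 798 MB/s figure.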
>>>>>>>>>3. Would it make sense to look at the NI-recommended
>alternative - the 8381 board? Given that the price difference on the NI
>site between the 8371 and 8381 is just $50, is it a good idea to go with
>the 8381 to be sure it definitely won't be a bottleneck? Or was it not
>tested with the X310 and we might encounter some compatibility issues, or
>are there other reasons against it? If it was tested, can we order it from
>Ettus instead of the 8371 at a comparable price? I must admit I'm really
>                            not overly familiar
>                            with the NI side of PCIe / PXI backend
>equipment. Generally, you only need to use either 10GE or PCIe. Use
>PCIe if you want
>                            to combine things into large
>                           LabVIEW-commanded clusters of USRPs [1], or
>                           if you know that you need the possibility to
>                            reduce USRP-host latency at the expense of
>                            more CPU consumption. For the general user,
>                            I recommend 10GE.
>>>>>>>>>4. Do I get it right that if we encounter problems at 100 MS/s,
>the next sampling rate we can use would be 50 MS/s, as we must use
>integer decimation values from the maximum one? And in this case we'll get
>100 MHz bandwidth and limit the X310's potential of 120 MHz. Am I correct
>here? You're quite right. So the point is that you
>                            can only use integer fractions of the master
>                            clock rate as sampling rates. For the
>                            default master clock rate of 200 MHz, these
>                            are 200MS/s, 100MS/s, (66.666..MS/s),
>                            50MS/s, ..., but there is also the 120MS/s
>                            master clock rate, which is predominantly
>                            used by LabVIEW, and can hence be used to
>                            get sampling rates of 120, 60, (40), 30
>                            MS/s. Also, we have a cellular-friendly
>                            184.32 MHz master clock rate. Notice that I
>                            put some potential sampling rates into
>                            parentheses; these are odd decimations of
>                            the master clock rate, which hence don't
>                            exhibit the same quality of
>                            flatness/anti-aliasing as the even
>                            decimations.
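The decimation rule above can be sketched as follows (plain Python; the odd/even flag mirrors the parenthesis convention used above):

```python
# Sampling rates available as integer decimations of a master clock rate.
# Odd decimations (other than 1) are flagged, since they don't get the
# same flatness/anti-aliasing quality as the even ones.
def available_rates(master_clock_msps: float, max_decim: int = 8):
    rates = []
    for decim in range(1, max_decim + 1):
        rate = master_clock_msps / decim
        odd = decim % 2 == 1 and decim != 1
        rates.append((round(rate, 3), odd))
    return rates

for mcr in (200.0, 120.0):
    print(mcr, "->", available_rates(mcr, max_decim=4))
```

For 200 MHz this yields 200, 100, (66.667), 50 MS/s, and for 120 MHz it yields 120, 60, (40), 30 MS/s, matching the lists above.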
>>>>>>>>>5. Am I correct that the frequency accuracy of the stock X310
>oscillator is much better than that of previous models, so we don't
>need an external clock if we don't plan to sync several devices, work at
>large distances, etc.? Uh, to be honest, I'm not quite sure about
>                            the "much better" part. "Much better"
>                            depends very much on your needs; if you
>                           couldn't use e.g. the N210 because its frequency
>                            offset or drift was much too large, then
>                            you'll want to use an external reference
>                           here, too. However, most customers are quite
>                            happy with the oscillators we have on board
>                            -- simply because frequency offset is a
>                            given fact in digital communications, and
>                           typically, the oscillator outperforms things
>                            like mobile terminals very significantly.
>                            This doesn't necessarily apply to things like
>                            high-performance interferometry, or very
>                            accurate signal detection.
>>>>>>>>>It won't be used for any 'real-world' cases like network
>deployment, etc. - still, speaking about GSM/UMTS lab experiments (like
>connecting one or two MS in the room), might we need an additional clock
>source? No. Base stations/eNodeBs define the clock
>                            of the cells they supply, so you don't have
>                           to worry. The oscillators really aren't that
>                            bad -- they just aren't perfect.
>>>>>>>>>Am I missing any critical parts to build up a working SDR
>setup? Well, you mentioned using the UBX-160; note that you can use both
>daughterboard slots with one
>                            daughterboard each, allowing you to do 2x2
>                            MIMO.
>>>>>>>>A few answers to frequently asked
>                            questions:
>>>>>>>>* all RF interfaces are 50 Ohm matched
>>>>>>>>* the output of the UBX daughterboard can go
>                            up to roughly 20dBm; for details, see
>                            [2]; safe input range is -infty to -15dBm.
>>>>>>>>* UHD itself doesn't need any special
>                            privileges and runs completely in userland.
>                            Only the network card or PCIe bridge card
>                            have kernel drivers.
>>>>>>>>* Ettus support will be on your side.
>>>>>>>>
>>>>>>>>>I would also like to hear from people who tried to get about
>100-120 MHz RX-TX bandwidth from the X3X0, e.g. simply streaming at
>100 MS/s to/from disk and doing processing offline later - which is
>the most stable and reliable way to do it - 10GbE or PCIe? In my
>experience: 10GE. The problem here
>                            definitely is, as Rob mentioned, that it's
>                            in no way "simple" to save 2x 100MS/s to
>                            disk. There's been (and will always be, and
>                            that's a good and just thing) long
>                            discussions on the needs, feasibility and
>                           example implementation support of sustaining
>                            such hard drive write rates.
>>>>>>>>A quick calculation shows that, for 16-bit
>                           short-integer fixed-point complex recording:
>>>>>>>>datarate(100 MS/s) = 16 bit/number *
>                            2 numbers/complex sample * 100 MS/s
>                            = 3.2 Gb/s for a single channel!
>>>>>>>>Now, you need to guarantee these
>                            rates, meaning that unless you can come up
>                            with a ridiculous amount of write buffering,
>                            the hard drive's minimum write rate must be
>                            above this value, and that doesn't even
>                            account for the fact that data is typically
>                            written to file systems that add even
>                            further overhead and latency. Now, I just
>                            came up with this less than reliable
>                            source [3]; the bad news is that the fastest
>                            HDD has an average write rate of 193 MB/s
>                            (ca. 1.6 Gb/s) according to their
>                            benchmark -- meaning that the minimum rate
>                            will be significantly lower. So even a RAID
>                            of 2 or 3 hard drives will not be
>                            sufficient, and one of four only if there
>                            were no such things as seek latency. In
>                            short: writing that bulk of data to disk in
>                            realtime isn't easy; but RAM disks
>                            work fine, so be sure to have enough RAM.
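That feasibility check, as a sketch (Python; the 193 MB/s per-drive figure is the benchmark average cited above, and average rates flatter the drives -- the minimum rate is lower still):

```python
# Required sustained write rate for sc16 complex recording, vs. what a
# RAID of identical HDDs can sustain on average (ignoring seek latency,
# which only makes the real minimum rate worse).
def required_gbps(rate_msps: float, channels: int) -> float:
    bits_per_sample = 16 * 2                    # 16-bit I + 16-bit Q
    return rate_msps * 1e6 * bits_per_sample * channels / 1e9

def raid_write_gbps(drives: int, per_drive_mbytes: float = 193.0) -> float:
    return drives * per_drive_mbytes * 8 / 1e3  # MB/s -> Gb/s

need = required_gbps(100, 2)                    # 2x 100 MS/s channels
for n in (1, 2, 3, 4):
    verdict = "ok" if raid_write_gbps(n) >= need else "too slow"
    print(f"{n} drive(s): {raid_write_gbps(n):.2f} Gb/s "
          f"vs {need:.1f} Gb/s needed -> {verdict}")
```

For the 2x 100 MS/s case the requirement is 6.4 Gb/s, so even four of those drives fall short on average rate alone -- which is why the RAM-disk suggestion above is the practical answer.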
>>>>>>>>
>>>>>>>>Best regards,
>>>>>>>>Marcus Müller
>>>>>>>>
>>>>>>>>[1] http://www.ni.com/white-paper/52382/en/
>>>>>>>>[2] 
>http://files.ettus.com/performance_data/ubx/UBX-without-UHD-corrections.pdf
>>>>>>>>[3]  http://hdd.userbenchmark.com/
>>>>>>>>_______________________________________________
>>>>>>>>USRP-users mailing list
>>>>>>>>USRP-users at lists.ettus.com
>>>>>>>>http://lists.ettus.com/mailman/listinfo/usrp-users_lists.ettus.com

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.