Discussion and technical support related to USRP, UHD, RFNoC
Hi,
On two different computers with identical hardware specs, we encountered
stability issues with the N310 receiver. The hardware specs of the
computers are listed below, though we strongly suspect that the issue lies
in the software (kernel/drivers) on the N310 itself.
Everything runs perfectly well until, at some point, the device hits a
kernel panic and reboots. From our understanding of the Linux kernel, the
panic must occur inside an interrupt handler, because it is not written to
any log file (we made the logs persistent on the device to verify this). It
sometimes shows up over SSH if we monitor with tail -f /var/log/messages
(yes, we see it there, but it is not written to disk). This only happens
after the N310 has been running for a longer period, in our case between 1
and 3 days. We have encountered the issue several times now and have
reproduced it with the standard FPGA image.
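Since the panic is visible over SSH but never reaches the device's disk, one option is to persist the stream on the host side instead. The following is only a minimal sketch under stated assumptions: the helper names `persist_lines` and `capture_remote_log`, the local file name, and the SSH login are hypothetical, and it simply wraps the same `tail -f /var/log/messages` command mentioned above.

```python
import os
import subprocess
from typing import IO, Iterable


def persist_lines(lines: Iterable[str], out: IO[str]) -> int:
    """Write each line to `out` and force it to disk right away.

    Flushing and fsync-ing per line is slow, but it means the last lines
    before a crash are already on the host's disk.
    """
    count = 0
    for line in lines:
        out.write(line if line.endswith("\n") else line + "\n")
        out.flush()
        os.fsync(out.fileno())
        count += 1
    return count


def capture_remote_log(host: str, local_path: str) -> None:
    """Follow /var/log/messages on `host` over SSH and append it locally.

    `host` is whatever you would pass to ssh (hypothetical example:
    "root@ni-n3xx-31AFFD1"). The function returns when the SSH session
    ends, for example when the device reboots.
    """
    cmd = ["ssh", host, "tail", "-f", "/var/log/messages"]
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc, \
            open(local_path, "a", encoding="utf-8") as out:
        persist_lines(proc.stdout, out)
```

It could be started with something like `capture_remote_log("root@ni-n3xx-31AFFD1", "n310_messages.log")` and left running until the device reboots.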
Our problem seems to be related (though with another device) to the issue
that was mentioned in January in this thread:
http://ettus.80997.x6.nabble.com/USRP-users-Kernel-Panic-with-v3-15-0-0-on-E320-td14098.html
Aside from the snippet below there are no other messages printed to the
messages log file.
--Snippet from /var/log/messages--
Aug 25 07:44:31 ni-n3xx-31AFFD1 kern.alert kernel: [82689.450921] Unable to handle kernel paging request at virtual address fffffffe
Aug 25 07:44:31 ni-n3xx-31AFFD1 kern.alert kernel: [82689.458127] pgd = d3d33249
Aug 25 07:44:31 ni-n3xx-31AFFD1 kern.alert kernel: [82689.460785] [fffffffe] *pgd=2fffd861, *pte=00000000, *ppte=00000000
Aug 25 07:44:31 ni-n3xx-31AFFD1 kern.emerg kernel: [82689.467121] Internal error: Oops: 80000007 [#1] PREEMPT SMP ARM
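
The bracketed number in each kernel line is the time since boot in seconds, so the snippet also tells how long the device had been up when it crashed. A small sketch for pulling that out of lines in the format shown above; the regular expression and the names `parse_kernel_line` and `format_uptime` are illustrative and only assume the syslog layout seen in this snippet.

```python
import re

# Matches lines such as:
# "Aug 25 07:44:31 ni-n3xx-31AFFD1 kern.alert kernel: [82689.450921] Unable to ..."
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"(?P<facility>\w+)\.(?P<severity>\w+) kernel: "
    r"\[\s*(?P<uptime>\d+\.\d+)\] ?(?P<message>.*)$"
)


def parse_kernel_line(line):
    """Return a dict of the line's fields, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["uptime"] = float(record["uptime"])  # seconds since boot
    return record


def format_uptime(seconds):
    """Render seconds since boot as hours, minutes and seconds."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


line = ("Aug 25 07:44:31 ni-n3xx-31AFFD1 kern.alert kernel: [82689.450921] "
        "Unable to handle kernel paging request at virtual address fffffffe")
record = parse_kernel_line(line)
print(record["severity"], format_uptime(record["uptime"]))  # alert 22h 58m 09s
```

For this particular crash the uptime works out to just under 23 hours.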
--Specs--
Processor: AMD Ryzen Threadripper 3970X 32-Core Processor
RAM: 256 GB RAM
Network device: Intel X710-DA2 10GbE with SFP+ direct attach cables
Direct attach cables: Cisco H10GB-10GB-CU3M
The N310 device has the latest stable firmware/FPGA images for UHD 3.15-LTS
as of today:
Mender: n3xx/meta-ettus-v3.15.0.0/n3xx_common_mender_default-v3.15.0.0.zip
FPGA: n3xx/fpga-9ba275de0b/n3xx_n310_fpga_default-g9ba275de.zip
We use the XG FPGA image from the zip file.
Our test flowgraph with the standard FPGA image only features two radio
blocks that stream at a master_clock_rate of 122.88 MHz and are each
connected to a DDC that decimates by a factor of 2 (though factors of 3
and 4 lead to the same issue) and then both connect to a null sink in
GNU Radio.
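
To put the load of this flowgraph into numbers, here is a rough back-of-the-envelope sketch. It assumes that each radio block produces samples at the master clock rate and that samples travel as sc16 (4 bytes per complex sample); neither is stated above, and packet headers are ignored, so treat the results as estimates only.

```python
MASTER_CLOCK_RATE_HZ = 122_880_000  # master_clock_rate from the flowgraph
NUM_CHANNELS = 2                    # two radio blocks
BYTES_PER_SAMPLE = 4                # assumption: sc16, 2 x 16 bit per sample


def ddc_output_rate(master_clock_rate_hz, decimation):
    """Sample rate per channel after the DDC, in samples per second."""
    return master_clock_rate_hz / decimation


def aggregate_bits_per_second(sample_rate, channels=NUM_CHANNELS,
                              bytes_per_sample=BYTES_PER_SAMPLE):
    """Payload bit rate of all channels together (headers not included)."""
    return sample_rate * channels * bytes_per_sample * 8


for decimation in (2, 3, 4):
    rate = ddc_output_rate(MASTER_CLOCK_RATE_HZ, decimation)
    gbps = aggregate_bits_per_second(rate) / 1e9
    print(f"decim {decimation}: {rate / 1e6:.2f} MS/s per channel, "
          f"{gbps:.2f} Gb/s total payload")
```

Under these assumptions the decimate-by-2 case comes to 61.44 MS/s per channel, or roughly 3.9 Gb/s of payload across both channels.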
We would appreciate it if anyone could try to reproduce this, or has any
ideas on how to resolve the issue.
Kind regards,
Peter
Hi,
I've been dealing with this issue for some time now, and I finally noticed
that the U-Boot image seems to tell the N310 that it has 4 GB of RAM.
The output of free -m tells me that it does not have 4 GB of RAM; it has
1 GB.
Some people on the Xilinx forums had a similar problem with kernel panics
because their U-Boot device tree configuration specified this:
memory@0 {
    device_type = "memory";
    reg = <0x0 0x40000000>;
};
If I run hexdump -C /proc/device-tree/memory/reg, I get: 00 00 00 00 04 00
00 00
As I don't really have experience with configuring U-Boot, does anyone know
an easy way to change that?
Kind regards,
Peter
On Wed, 26 Aug 2020 at 17:09, Peter Langer <
peter.langer41@googlemail.com> wrote:
On Fri, 28 Aug 2020 at 13:30, Peter Langer <
peter.langer41@googlemail.com> wrote:
> Some people on the Xilinx forums had a similar problem with kernel
> panics because their U-Boot device tree configuration specified this:
> memory@0 {
>     device_type = "memory";
>     reg = <0x0 0x40000000>;
> };
Sorry, that was a wrong clue; I thought reg was measured in 4-byte
words.
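
For reference, the size cell of a memory node's reg property is a byte count, stored as big-endian 32-bit cells. The sketch below decodes the value from the quoted device tree snippet; it assumes one address cell and one size cell (#address-cells = 1, #size-cells = 1, which fits the 8 bytes that hexdump printed), and the function names are illustrative only.

```python
import struct


def cells_to_int(cells):
    """Combine big-endian 32-bit cells into one integer."""
    value = 0
    for cell in cells:
        value = (value << 32) | cell
    return value


def decode_reg(raw, address_cells=1, size_cells=1):
    """Decode a raw `reg` property into (base_address, size_in_bytes) pairs."""
    entry_cells = address_cells + size_cells
    if len(raw) % (4 * entry_cells) != 0:
        raise ValueError("reg length does not match the cell counts")
    cells = struct.unpack(f">{len(raw) // 4}I", raw)
    regions = []
    for i in range(0, len(cells), entry_cells):
        base = cells_to_int(cells[i:i + address_cells])
        size = cells_to_int(cells[i + address_cells:i + entry_cells])
        regions.append((base, size))
    return regions


# reg = <0x0 0x40000000> as it would be stored in the device tree blob
raw = struct.pack(">II", 0x0, 0x40000000)
(base, size), = decode_reg(raw)
print(f"base=0x{base:08x} size={size // 2**20} MiB")        # size=1024 MiB
print(f"if misread as 4-byte words: {size * 4 // 2**30} GiB")  # 4 GiB
```

Read as bytes, 0x40000000 is 1 GiB, which agrees with what free -m reports; only the 4-byte-word reading gives 4 GiB.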
I'm currently running a test with avahi-daemon disabled (it has been
running for 1 day 21 hours so far).
Fingers crossed.
Peter