 
                    Has anyone been able to run thunderbird under mochitest under valgrind?
I have been struggling to run TB under valgrind when executing mochitest
to check for memory errors.
(I used to this under |make mozmill| test framework, and I could fix
maybe a few dozen memory errors.)
Unfortunately, after switching to mochitest framework since December
2019, I have not been able to
run TB under valgrind satisfactorily at all.
Recent problem is that I see successful invocation of TB (I have to use
a replacement binary for TB, that pretends to be thunderbird binary, but
actually executes a saved TB binary under valgrind because there is an
argument passing issue to |mach| when valgrind is invoked using
|--valgrind| option.)
after upping the number of threads under valgrind to 5000, I get timeout.
Timeout IS expected because TB runs slowly under valgrind, but I am
not sure where the
timeout occurs exactly.
runtests.py seems to wait for "browser" and then times out.
From the log:
 0:03.53 INFO Application command:
/NEW-SSD/moz-obj-dir/objdir-tb3/dist/bin/thunderbird -marionette
-foreground -profile /COMM-CENTRAL/TMP-DIR/tmpbShgrj.mozrunner
 0:03.54 INFO runtests.py | Application pid: 6110
 0:03.54 Started process GECKO(6110)
 0:03.54 GECKO(6110) run-valgrind (masquerading as thunderbird binary)
 0:03.54 GECKO(6110) final command line is:
 0:03.55 GECKO(6110) valgrind --trace-children=yes --fair-sched=yes
--smc-check=all-non-file --gen-suppressions=all -v --trace-signals=yes
--vex-iropt-register-updates=allregs-at-mem-access --track-origins=yes
--child-silent-after-fork=yes
--trace-children-skip=/usr/bin/lsb_release,/usr/bin/hg,/bin/rm,/bin/certutil,/bin/pk12util,/bin/ssltunnel,/bin/uname,/bin/which,/bin/ps,/bin/grep,/bin/java,/fix-stacks,/firefox/firefox,/bin/firefox-esr,/bin/python,/bin/python2,/bin/python2.7,/bin/bash
--max-threads=5000  --max-stackframe=16000000 --num-transtab-sectors=24
--tool=memcheck --freelist-vol=500000000 --redzone-size=128
--px-default=allregs-at-mem-access
--px-file-backed=unwindregs-at-mem-access --malloc-fill=0xA5
--free-fill=0xC3 --num-callers=50
--suppressions=/home/ishikawa/Dropbox/myown.sup
--show-mismatched-frees=no --show-possibly-lost=no
--read-inline-info=yes
/KERNEL-SRC/moz-obj-dir/objdir-tb3/dist/bin/thunderbird-bin -marionette
-foreground -profile /COMM-CENTRAL/TMP-DIR/tmpbShgrj.mozrunner
 0:03.56 GECKO(6110) ==6113== Memcheck, a memory error detector
 0:03.56 GECKO(6110) ==6113== Copyright (C) 2002-2017, and GNU GPL'd,
by Julian Seward et al.
 0:03.56 GECKO(6110) ==6113== Using
Valgrind-3.16.0.GIT-c48fb614a1-20200507X and LibVEX; rerun with -h for
copyright info
 0:03.56 GECKO(6110) ==6113== Command:
/KERNEL-SRC/moz-obj-dir/objdir-tb3/dist/bin/thunderbird-bin -marionette
-foreground -profile /COMM-CENTRAL/TMP-DIR/tmpbShgrj.mozrunner
 0:03.56 GECKO(6110) ==6113==
 0:03.56 GECKO(6110) --6113-- Valgrind options:
 0:03.56 GECKO(6110) --6113--    --trace-children=yes
 0:03.56 GECKO(6110) --6113--    --fair-sched=yes
 0:03.56 GECKO(6110) --6113--    --smc-check=all-non-file
 0:03.56 GECKO(6110) --6113--    --gen-suppressions=all
 0:03.56 GECKO(6110) --6113--    -v
 0:03.56 GECKO(6110) --6113--    --trace-signals=yes
 0:03.56 GECKO(6110) --6113--
--vex-iropt-register-updates=allregs-at-mem-access
 0:03.56 GECKO(6110) --6113--    --track-origins=yes
 0:03.56 GECKO(6110) --6113--    --child-silent-after-fork=yes
 0:03.56 GECKO(6110) --6113--
--trace-children-skip=/usr/bin/lsb_release,/usr/bin/hg,/bin/rm,/bin/certutil,/bin/pk12util,/bin/ssltunnel,/bin/uname,/bin/which,/bin/ps,/bin/grep,/bin/java,/fix-stacks,/firefox/firefox,/bin/firefox-esr,/bin/python,/bin/python2,/bin/python2.7,/bin/bash
 0:03.56 GECKO(6110) --6113--    --max-threads=5000
 0:03.56 GECKO(6110) --6113--    --max-stackframe=16000000
 0:03.56 GECKO(6110) --6113--    --num-transtab-sectors=24
 0:03.56 GECKO(6110) --6113--    --tool=memcheck
 0:03.56 GECKO(6110) --6113--    --freelist-vol=500000000
 0:03.56 GECKO(6110) --6113--    --redzone-size=128
 0:03.56 GECKO(6110) --6113-- --px-default=allregs-at-mem-access
 0:03.56 GECKO(6110) --6113-- --px-file-backed=unwindregs-at-mem-access
 0:03.56 GECKO(6110) --6113--    --malloc-fill=0xA5
 0:03.56 GECKO(6110) --6113--    --free-fill=0xC3
 0:03.56 GECKO(6110) --6113--    --num-callers=50
 0:03.56 GECKO(6110) --6113--
--suppressions=/home/ishikawa/Dropbox/myown.sup
 0:03.56 GECKO(6110) --6113--    --show-mismatched-frees=no
 0:03.56 GECKO(6110) --6113--    --show-possibly-lost=no
 0:03.56 GECKO(6110) --6113--    --read-inline-info=yes
 0:03.56 GECKO(6110) --6113-- Contents of /proc/version:
 0:03.56 GECKO(6110) --6113--   Linux version 5.6.0-1-amd64
(debian-kernel@lists.debian.org) (gcc version 9.3.0 (Debian 9.3.0-11))
#1 SMP Debian 5.6.7-1 (2020-04-29)
 0:03.56 GECKO(6110) --6113--
 0:03.56 GECKO(6110) --6113-- Arch and hwcaps: AMD64, LittleEndian,
amd64-cx16-lzcnt-rdtscp-sse3-ssse3-avx-avx2-rdrand
 0:03.56 GECKO(6110) --6113-- Page sizes: currently 4096, max supported
4096
 0:03.56 GECKO(6110) --6113-- Valgrind library directory:
/usr/local/lib/valgrind
The later on, I see the following line:
2:03.56 INFO runtests.py | Waiting for browser...
Eventually,
20:06.17 INFO TEST-UNEXPECTED-TIMEOUT | automation.py | application
timed out after 2960 seconds with no output
20:06.46 ERROR Force-terminating active process(es).
20:06.74 INFO Determining child pids from psutil...
20:11.00 INFO [6111]
20:12.27 INFO ==> process 6113 launched child process 6147
20:12.31 INFO Found child pids: set([6147, 6111])
20:13.01 INFO Failed to get child procs
20:13.02 INFO Killing process: 6147
20:13.38 INFO Failed to retrieve MOZ_UPLOAD_DIR env var
20:13.46 INFO Can't trigger Breakpad, just killing process
20:13.56 INFO Error: Failed to kill process 6147: psutil.NoSuchProcess
no process found with pid 6147
20:13.56 INFO Killing process: 6111
20:13.56 INFO Not taking screenshot here: see the one that was
previously logged
20:13.56 INFO Can't trigger Breakpad, just killing process
20:44.39 INFO psutil found pid 6111 dead
20:44.39 INFO failed to kill pid 6119 after 30s
20:44.39 INFO failed to kill pid 6113 after 30s
20:44.39 INFO Killing process: 6110
20:44.39 INFO Not taking screenshot here: see the one that was
previously logged
20:44.39 INFO Can't trigger Breakpad, just killing process
20:44.49 INFO psutil found pid 6110 dead
20:44.49 INFO psutil found pid 6110 dead
23:25.34 WARNING runtests.py | Failed to get app exit code -
running/crashed?
23:25.40 Main app process: exit 0
23:25.52 INFO runtests.py | Application ran for: 0:23:21.973599
23:25.55 INFO zombiecheck | Reading PID log:
/COMM-CENTRAL/TMP-DIR/tmpeA04Wqpidlog
23:25.85 INFO ==> process 6113 launched child process 6147
23:26.18 INFO zombiecheck | Checking for orphan process with PID: 6147
Traceback (most recent call last):
  File
"/NEW-SSD/moz-obj-dir/objdir-tb3/_tests/testing/mochitest/runtests.py",
line 2993, in doTests
    e10s=options.e10s
  File
"/NEW-SSD/moz-obj-dir/objdir-tb3/_tests/testing/mochitest/runtests.py",
line 2456, in runApp
    six.reraise(exc, value, tb)
  File
"/NEW-SSD/moz-obj-dir/objdir-tb3/_tests/testing/mochitest/runtests.py",
line 2369, in runApp
    self.marionette.start_session()
  File
"/NEW-SSD/NREF-COMM-CENTRAL/mozilla/testing/marionette/client/marionette_driver/decorators.py",
line 36, in _
    m._handle_socket_failure()
  File
"/NEW-SSD/NREF-COMM-CENTRAL/mozilla/testing/marionette/client/marionette_driver/marionette.py",
line 654, in _handle_socket_failure
    reraise(exc_cls, exc, tb)
  File
"/NEW-SSD/NREF-COMM-CENTRAL/mozilla/testing/marionette/client/marionette_driver/decorators.py",
line 26, in _
    return func(*args, **kwargs)
  File
"/NEW-SSD/NREF-COMM-CENTRAL/mozilla/testing/marionette/client/marionette_driver/marionette.py",
line 1112, in start_session
    self.raise_for_port(timeout=timeout)
  File
"/NEW-SSD/NREF-COMM-CENTRAL/mozilla/testing/marionette/client/marionette_driver/marionette.py",
line 573, in raise_for_port
    self.host, self.port))
timeout: Timed out waiting for connection on 127.0.0.1:2828!
23:42.19 ERROR Automation Error: Received unexpected exception while
running application
23:42.28 INFO Stopping web server
23:44.90 INFO Stopping web socket server
23:45.82 INFO Stopping ssltunnel
So nothing useful has been logged at all.
(Do I need to exclude a program that is supposed to service port 2828
from running under valgrind as in
"--trace-children-skip=/usr/bin/lsb_release,/usr/bin/hg,/bin/rm,/bin/certutil,/bin/pk12util,/bin/ssltunnel,/bin/uname,/bin/which,/bin/ps,/bin/grep,/bin/java,/fix-stacks,/firefox/firefox,/bin/firefox-esr,/bin/python,/bin/python2,/bin/python2.7,*/bin/bash"
???)
With |make mozmill| after lengthening  timeout value setting, TB under
valgrind ran the tests more or less successfully.
Any thought or personal experience with valgrind with TB with mochitest?
Chiaki
