maildev@lists.thunderbird.net

Thunderbird email developers

Ideas for Addressing Long Standing Bug?

RS
Ryan Sipes
Fri, Jul 13, 2018 3:59 PM

Fellow Thunder Flock Members,

A long standing bug has been brought to my attention regarding false
positives in scam detection:
https://bugzilla.mozilla.org/show_bug.cgi?id=623198

I wanted to post this bug here and see if anyone on this list had ideas
on how to address this. Do you think we should keep scam detection
enabled? If so, are there any ways to improve scam detection so that
there are not so many false positives?

Thoughts and feedback appreciated.

--
Ryan Sipes
Community Manager
Thunderbird

JK
Jonathan Kamens
Fri, Jul 13, 2018 4:59 PM

I agree with the many commenters there who have argued that extremely
inaccurate scam detection is in fact worse than no scam detection at
all, and if indeed it is as inaccurate as people there are claiming it
is, it should be turned off by default.

Personally, I use bogofilter on my mail server, which catches the vast
majority of spam and phishing emails before I ever see them with a very
low false-positive rate, so the first thing I do whenever setting up a
new Thunderbird profile is disable the spam and scam detection engines
in Thunderbird. Therefore, I don't have any first-hand experience
speaking to whether the claims of inaccuracy in that ticket are accurate.

The thing about Wayne's "We should fix it rather than replace it"
argument (in comment 93
https://bugzilla.mozilla.org/show_bug.cgi?id=623198#c93) is that we
need to recognize the reality that the resources currently available for
substantive Thunderbird development are extraordinarily limited, so
unless this issue is bounced to the top of the priority list, it's not
going to get any developer work directed at it for a long time. Given
that, it really should be turned off if it's so inaccurate that it gives
users bad information more often than good.

I suppose there is a third alternative, which may or may not have been
suggested in that ticket (frankly, I'm not going to read over 100
comments to find out). I'm under the impression that Thunderbird's
Telemetry support allows the app to capture information about what users
do and transmit it anonymously to the developers to give them a better
idea about how the app is being used and guide future changes. A good
start would be to capture Telemetry data about how often TB marks a
message as a potential scam, how often users click the button telling TB
that a message in fact is not a scam, and how often users click the
button telling TB that a message which wasn't classified as such is a
scam (the last statistic would need to be taken much less seriously
than the others, though, since the most likely outcome of a user
recognizing a scam message as such is to delete it). That Telemetry data
might give us a better idea of how accurate the scam detector is.
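
As a rough illustration of what those counters could tell us, here is a
minimal Python sketch. The counter names are hypothetical and are not real
Thunderbird Telemetry probes; the sketch also assumes users only click
"not a scam" on messages that really are legitimate.

```python
from dataclasses import dataclass


@dataclass
class ScamTelemetry:
    """Hypothetical aggregate counters, not real Thunderbird probes."""
    flagged: int             # messages the detector marked as a potential scam
    user_said_not_scam: int  # flagged messages the user marked "not a scam"
    user_said_scam: int      # unflagged messages the user marked as a scam


def false_positive_lower_bound(t: ScamTelemetry) -> float:
    """Share of flagged messages that users explicitly overrode.

    This is only a lower bound on the false-positive share: users who
    ignore a wrong warning without clicking the button are not counted.
    """
    if t.flagged == 0:
        return 0.0
    return t.user_said_not_scam / t.flagged
```

The third counter is carried along but deliberately left out of the
estimate, for the reason given above: most users who spot a scam simply
delete it.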

Regarding the question of how to build a better scam detector, as
comment 97 https://bugzilla.mozilla.org/show_bug.cgi?id=623198 points
out, there is a lot of good research into how to build effective scam
detectors, so I would suggest the first step in any effort to build a
better scam detector into TB would be to consult the research and the
experts.

Or recognize that spam / scam detection is actually the job of the MTA,
not the MUA, and therefore we should get rid of all the spam and scam
detection functionality in TB and tell people who want it to switch to a
better mail service provider if theirs isn't doing a good job of filtering.

  jik

PK
Philipp Kewisch
Fri, Jul 13, 2018 6:35 PM

Hi Folks,

I do get this on a few messages from people I know, but all in all, if there is something that tells me a link does not have the same target it seems to have, that alone would be enough for me to have it enabled.
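
To make the "link does not have the same target it seems to have" check concrete, here is a minimal Python sketch. It is an assumption about how such a check could work, not Thunderbird's actual implementation, and the function name is hypothetical. It only compares hosts when the visible link text itself looks like a URL.

```python
from urllib.parse import urlparse


def link_text_mismatch(href: str, text: str) -> bool:
    """True when the visible text looks like a URL whose host differs
    from the host the link really points to."""
    shown = text.strip()
    if "://" not in shown:
        if not shown.lower().startswith("www."):
            # Plain wording such as "click here": nothing to compare.
            return False
        shown = "http://" + shown
    shown_host = urlparse(shown).hostname
    real_host = urlparse(href).hostname
    if not shown_host or not real_host:
        return False
    return shown_host != real_host
```

For example, a link whose text reads https://bank.example/login but whose target is http://evil.example/login is reported as a mismatch, while a link labelled "click here" is not checked at all.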

Both sides are purely anecdotal iiuc, so telemetry data would be great. Unfortunately I believe it is not set up correctly, so gathering data is a little more involved.

I am certain though we shouldn't just go turn it off because of a hunch.

Philipp

JB
Josiah Bruner
Fri, Jul 13, 2018 7:18 PM

Hi all,

Regarding potentially disabling:

I absolutely agree with Philipp and (indirectly) Jonathan that we need
actual data in order to know that the current false positive rate "harms
security" and therefore should be disabled. This was my main issue with
the original proposal.

Telemetry is likely inadequate and it looks like we really need some UX
research done instead. An experimental setup might look like:


Given:

  • A known phishing email "NoDetect" that TB's scam detector does not
    classify as a scam and where the malicious link in the email is
    controlled by the experimenter.

  • A similar phishing email "Detect" that TB's scam detector DOES
    classify as a scam and where the malicious link in the email is
    controlled by the experimenter.

  • A population of TB users USERS who have scam detection enabled and
    have used TB for some considerable amount of time.

Experiment:

  • Send a random half of USERS "NoDetect" and note the percentage
    (NoDetectHitRate) of users who click on the link.
  • Send the disjoint other half of USERS "Detect" and note the percentage
    (DetectHitRate) of users who click on the link.

Question:

Is NoDetectHitRate < DetectHitRate ?

If so, we can conclude that users are safer without scam detection enabled.

If not, we can conclude that scam detection does prevent more clicks
than it causes.
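
Whether an observed difference between the two hit rates is more than
noise could be checked with a standard two-proportion z-test. The Python
sketch below is one assumed way to do that; the function name and the
sample numbers in the note are illustrative, not data from any real
experiment.

```python
import math
from typing import Tuple


def two_proportion_z(clicks_a: int, n_a: int,
                     clicks_b: int, n_b: int) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for H0: both groups share one click rate.

    Group A would be the "NoDetect" half, group B the "Detect" half.
    """
    p_a = clicks_a / n_a
    p_b = clicks_b / n_b
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Nobody clicked, or everybody did: no evidence of a difference.
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

Calling two_proportion_z(30, 500, 60, 500) gives a negative z, meaning
NoDetectHitRate is below DetectHitRate in that made-up sample; the
p-value says how surprising that gap would be if the warning had no
effect.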

However, I'm not sure how to execute this experiment accurately and
ethically (it would be easy to mass send these emails and track
responses). Maybe some UX researchers do?

Regarding improving the detection rate:

As Jonathan noted, in comment 97 I provide evidence that we likely can
improve detection.

In comment 89 I propose one potential method to do this.

Critically, in order to really improve the detector and KNOW that we are
improving it, we need test data to evaluate on. In comment 89 I mention
this and note that the APWG has such data. If Mozilla is part of this
working group (which seems likely to me), we may be able to obtain what
we need to perform work.
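
A minimal sketch of what "evaluate on test data" could mean in practice,
in Python. The detector is passed in as a plain function and the labelled
messages are assumed to be (text, is_scam) pairs; neither reflects the
format of any real data set such as the APWG's.

```python
from typing import Callable, Dict, Iterable, Tuple


def evaluate(detector: Callable[[str], bool],
             labelled: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Score a detector against messages with known labels."""
    tp = fp = tn = fn = 0
    for message, is_scam in labelled:
        flagged = detector(message)
        if flagged and is_scam:
            tp += 1
        elif flagged and not is_scam:
            fp += 1
        elif not flagged and is_scam:
            fn += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> float:
        return num / den if den else 0.0

    return {
        "precision": ratio(tp, tp + fp),            # flagged and really scam
        "recall": ratio(tp, tp + fn),               # scams that were caught
        "false_positive_rate": ratio(fp, fp + tn),  # legit mail wrongly flagged
    }
```

Running the same labelled set through the old and the new logic and
comparing these three numbers is what would let us KNOW whether a change
is an improvement.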

For the record, although I'm a volunteer, it is this phishing detection
algorithm where I intend to spend my efforts. If we can get data, I
will happily do the work needed to improve the logic.

// Josiah Bruner

BB
Ben Bucksch
Fri, Jul 13, 2018 7:40 PM

As for spam filtering, i use spam filters both on server and on client level. the thunderbird spam filter is very good with very few false positives. it keeps a lot of spam away from me and works better than ISP level spam filters, because it's trained on my email. however, to start, i had trained it on large bodies of ham (i marked existing mail folders as "not spam"), and marked undetected spam as spam, so after a while, it became very good.

the scam filter rules are static and very primitive. e.g. it considers all links with IP addresses as scam. obviously too simplistic.
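
to illustrate the kind of rule meant here, a minimal Python sketch follows. it is an assumption about what an "IP address link" rule looks like, not Thunderbird's actual code, and the refinement that exempts private addresses (routers, intranet hosts) is only one hypothetical way to make it less simplistic.

```python
import ipaddress
from urllib.parse import urlparse


def host_is_ip_literal(url: str) -> bool:
    """The primitive rule: the link's host is a raw IP address."""
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False
    return True


def is_suspicious_ip_link(url: str) -> bool:
    """Hypothetical refinement: flag only publicly routable IP hosts."""
    if not host_is_ip_literal(url):
        return False
    return not ipaddress.ip_address(urlparse(url).hostname).is_private
```

with the refinement, a link to http://192.168.1.1/admin (a typical home router) is no longer flagged, while http://8.8.8.8/login still is.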

i would also suggest to improve it rather than remove it. the UI is a large part of the work and can stay. somebody could improve the filter rules, it should not be too difficult, if based on existing research. i agree we should profit from it. removing the scam filter is a waste.

i don't think telemetry can give us useful information, because - as you correctly stated -, there is no point for an end user to mark something as scam, and having only one half of the data does not help.

this would be a nice first contribution for an advanced and ambitious student.

Ben

Am 13. Juli 2018 18:59:55 MESZ schrieb Jonathan Kamens jik@kamens.us:

I agree with the many commenters there who have argued that extremely
inaccurate scam detection is in fact worse than no scam detection at
all, and if indeed it is as inaccurate as people there are claiming it
is, it should be turned off by default.

Personally, I use bogofilter on my mail server, which catches the vast
majority of spam and phishing emails before I ever see them with a very

low false-positive rate, so the first thing I do whenever setting up a
new Thunderbird profile is disable the spam and scam detection engines
in Thunderbird. Therefore, I don't have any first-hand experience
speaking to whether the claims of inaccuracy in that ticket are
accurate.

The thing about Wayne's "We should fix it rather than replace it"
argument (in comment 93
https://bugzilla.mozilla.org/show_bug.cgi?id=623198#c93) is that we
need to recognize the reality that the resources currently available
for
substantive Thunderbird development are extraordinarily limited, so
unless this issue is bounced to the top of the priority list, it's not
going to get any developer work directed at it for a long time. Given
that, it really should be turned off if it's so inaccurate that it
gives
users bad information more often than good.

I suppose there is a third alternative, which may or may not have been
suggested in that ticket (frankly, I'm not going to read over 100
comments to find out). I'm under the impression that Thunderbird's
Telemetry support allows the app to capture information about what
users do and transmit it anonymously to the developers to give them a
better idea about how the app is being used and guide future changes. A
good start would be to capture Telemetry support about how often TB
marks a message as a potential scam, how often users click the button
telling TB that a message in fact is not a scam, and how often users
click the button telling TB that a message which wasn't classified as
such is a scam (the latter statistic would need to be taken much less
seriously than the others, though, since the most likely outcome of a
user recognizing a scam message as such is to delete it). That Telemetry
data might give us a better idea of how accurate the scam detector is.

Regarding the question of how to build a better scam detector, as
comment 97 https://bugzilla.mozilla.org/show_bug.cgi?id=623198 points
out, there is a lot of good research into how to build effective scam
detectors, so I would suggest the first step in any effort to build a
better scam detector into TB would be to consult the research and the
experts.

Or recognize that spam / scam detection is actually the job of the MTA,
not the MUA, and therefore we should get rid of all the spam and scam
detection functionality in TB and tell people who want it to switch to
a better mail service provider if theirs isn't doing a good job of
filtering.

  jik

On 7/13/18 11:59 AM, Ryan Sipes wrote:

Fellow Thunder Flock Members,

A long standing bug has been brought to my attention regarding false
positives in scam detection:
https://bugzilla.mozilla.org/show_bug.cgi?id=623198

I wanted to post this bug here and see if anyone on this list had ideas
on how to address this. Do you think we should keep scam detection
enabled? If so, are there any ways to improve scam detection so that
there are not so many false positives?

Thoughts and feedback appreciated.

--
Sent from my phone. Please excuse the brevity.

MM
Magnus Melin
Fri, Jul 13, 2018 7:41 PM

On 13-07-2018 22:18, Josiah Bruner wrote:

I absolutely agree with Philipp and (indirectly) Jonathan that we need
actual data in order to know that the current false positive rate
"harms security" and therefore should be disabled.

I think we're conflating two issues here. The current case, where the
scam warning is triggered by email trackers, should be reworked, and
that's a fairly small change. An easy win. After that, there are of
course more elaborate improvements that can be done.

There are different kinds of scams, but maybe checking with the safe
browsing data (from Firefox) before following links would catch a lot of
them.

 -Magnus

JK
Jonathan Kamens
Sat, Jul 14, 2018 5:17 PM

OK, so if you want to dismiss the experiences of people commenting on
the bug as anecdotal, then here's some hard data to consider.

I save all of my spam / scam messages and legitimate email messages
going back for several months so that I can retrain my Bayesian spam
filter when it proves necessary to do so. I was therefore able to
conduct the following experiment.

I scanned 4,703 spam / scam messages I received between April 13 and
July 13 (92 days) to find out which of them Thunderbird would flag as a
scam. It flagged 38 of them, a success rate of 0.8%. Not all of the
messages in question were ones that I would expect a scam detector to
flag, but most of them were, so I would certainly not consider a scam
detector that flags only 0.8% of my spam messages to be particularly
useful or accurate.

More significantly, I also scanned 12,860 legitimate email messages
which I received between February 13 and July 13 (151 days) to find out
which of them Thunderbird would flag falsely as a scam. It flagged 105
of them, a rate of 0.82%. The difference of only 0.02% between the false
positive rate and the true positive rate seems to imply that it's pretty
much a crap-shoot whether the scam detector is accurate for any
particular message.
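
The two rates can be recomputed directly from the counts quoted above. This is only a minimal sketch in Python; the function and variable names are illustrative and have nothing to do with Thunderbird's own code.

```python
def flag_rate(flagged: int, total: int) -> float:
    """Return the percentage of scanned messages that were flagged as scam."""
    return 100.0 * flagged / total


# Counts quoted in this message.
true_positive_rate = flag_rate(38, 4703)     # known spam / scam messages flagged
false_positive_rate = flag_rate(105, 12860)  # legitimate messages flagged

print(f"spam / scam flagged: {true_positive_rate:.2f}%")
print(f"legitimate flagged:  {false_positive_rate:.2f}%")
```

Without rounding, the two rates come out at roughly 0.81% and 0.82%, so the gap between them is, if anything, smaller than the rounded figures suggest.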

Note that most of the known spam / scam messages which I fed through the
scam detector were detected and filtered out by my spam filter when I
first received them, i.e., I never actually saw them in my inbox. If a
user is using any halfway decent spam detector, then they will have a
similar experience. That means that they are likely to see far more
false positives than true positives from the scam detector.
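
To put a rough number on that, the sketch below estimates what share of the scam warnings a user actually sees would point at real spam / scam. It reuses the counts from this message, but `spam_leak` (the fraction of spam that gets past the server-side filter) is a purely hypothetical value chosen for illustration, and the calculation ignores the fact that the two samples cover different time windows.

```python
def warning_precision(spam_total: int, legit_total: int,
                      tp_rate: float, fp_rate: float,
                      spam_leak: float) -> float:
    """Fraction of scam warnings seen in the inbox that are on real spam / scam."""
    # Only spam that slips past the spam filter can trigger a visible warning.
    true_alerts = spam_total * spam_leak * tp_rate
    false_alerts = legit_total * fp_rate
    return true_alerts / (true_alerts + false_alerts)


precision = warning_precision(
    spam_total=4703,
    legit_total=12860,
    tp_rate=38 / 4703,      # measured rate on known spam / scam
    fp_rate=105 / 12860,    # measured rate on legitimate mail
    spam_leak=0.05,         # hypothetical: 5% of spam reaches the inbox
)
print(f"{precision:.1%} of visible scam warnings would be on real spam / scam")
```

Under that assumption only a small fraction of the warnings a user sees would be genuine; a different `spam_leak` changes the exact figure but not the direction of the result.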

In another message in this thread, Josiah proposed a UX experiment to
determine whether a lot of false positives from the scam detector are
influencing people to ignore its warnings. With all due respect to
Josiah, we don't need to do that research. The research has already been
done. It's well-established in the cybersecurity field (about which I
can speak with some conviction, seeing as how I've been working in
cybersecurity for nearly 30 years and I'm currently a CISO) that alert
fatigue causes alerts to be ignored. Furthermore, alert fatigue bleeds
between applications, which means that by generating a lot of false
positives in Thunderbird, we're making it more likely that people will
ignore real alerts from other applications. This is not good for anyone.

Frankly, if Thunderbird's spam filter is halfway decent (and I don't
know whether it is, because as I said previously, I use bogofilter), it
is probably going to do a far better job of filtering out scams than
Thunderbird's current scam detector, and given the accuracy of the
detector as illustrated above, I'd advocate ditching it entirely and
sticking with just the spam filter.

So, I've now presented hard data illustrating just how bad the scam
detector's performance is. Are we still going to insist that we
shouldn't take any action because we only have anecdotal data to work with?

  Jonathan Kamens

On 7/13/18 2:35 PM, Philipp Kewisch wrote:

Hi Folks,

I do get this on a few messages from people I know, but all in all, if
there is something that tells me a link does not have the same target
it seems to have, that alone would be enough for me to have it enabled.

Both sides are purely anecdotal iiuc, so telemetry data would be
great. Unfortunately I believe it is not set up correctly, so
gathering data is a little more involved.

I am certain though we shouldn't just go turn it off because of a hunch.

Philipp

On 13. Jul 2018, at 6:59 PM, Jonathan Kamens <jik@kamens.us> wrote:

I agree with the many commenters there who have argued that extremely
inaccurate scam detection is in fact worse than no scam detection at
all, and if indeed it is as inaccurate as people there are claiming
it is, it should be turned off by default.

Personally, I use bogofilter on my mail server, which catches the
vast majority of spam and phishing emails before I ever see them with
a very low false-positive rate, so the first thing I do whenever
setting up a new Thunderbird profile is disable the spam and scam
detection engines in Thunderbird. Therefore, I don't have any
first-hand experience speaking to whether the claims of inaccuracy in
that ticket are accurate.

The thing about Wayne's "We should fix it rather than replace it"
argument (in comment 93
https://bugzilla.mozilla.org/show_bug.cgi?id=623198#c93) is that we
need to recognize the reality that the resources currently available
for substantive Thunderbird development are extraordinarily limited,
so unless this issue is bounced to the top of the priority list, it's
not going to get any developer work directed at it for a long time.
Given that, it really should be turned off if it's so inaccurate that
it gives users bad information more often than good.

I suppose there is a third alternative, which may or may not have been
suggested in that ticket (frankly, I'm not going to read over 100
comments to find out). I'm under the impression that Thunderbird's
Telemetry support allows the app to capture information about what
users do and transmit it anonymously to the developers to give them a
better idea about how the app is being used and guide future changes.
A good start would be to capture Telemetry support about how often TB
marks a message as a potential scam, how often users click the button
telling TB that a message in fact is not a scam, and how often users
click the button telling TB that a message which wasn't classified as
such is a scam (the latter statistic would need to be taken much less
seriously than the others, though, since the most likely outcome of a
user recognizing a scam message as such is to delete it). That
Telemetry data might give us a better idea of how accurate the scam
detector is.

Regarding the question of how to build a better scam detector, as
comment 97 https://bugzilla.mozilla.org/show_bug.cgi?id=623198
points out, there is a lot of good research into how to build
effective scam detectors, so I would suggest the first step in any
effort to build a better scam detector into TB would be to consult
the research and the experts.

Or recognize that spam / scam detection is actually the job of the
MTA, not the MUA, and therefore we should get rid of all the spam and
scam detection functionality in TB and tell people who want it to
switch to a better mail service provider if theirs isn't doing a good
job of filtering.

  jik

On 7/13/18 11:59 AM, Ryan Sipes wrote:

Fellow Thunder Flock Members,

A long standing bug has been brought to my attention regarding false
positives in scam detection:
https://bugzilla.mozilla.org/show_bug.cgi?id=623198

I wanted to post this bug here and see if anyone on this list had ideas
on how to address this. Do you think we should keep scam detection
enabled? If so, are there any ways to improve scam detection so that
there are not so many false positives?

Thoughts and feedback appreciated.

JK
Jonathan Kamens
Sat, Jul 14, 2018 5:20 PM

Oh, I forgot one more thing that I wanted to point out. The other risk
of a scam detector with a low success rate and a high false positive
rate is that it gives people a false sense of security. They see the
scam detector triggering frequently enough that they know it's there and
think that a high false positive rate must also mean that it does a good
job of detecting real scams, and that makes them /more/ likely to click
on the link when they receive a scam that the detector doesn't flag.

In short, the scam detector really is doing more harm than good in its
current form.

  jik

On 7/14/18 1:17 PM, Jonathan Kamens wrote:

OK, so if you want to dismiss the experiences of people commenting on
the bug as anecdotal, then here's some hard data to consider.

I save all of my spam / scam messages and legitimate email messages
going back for several months so that I can retrain my Bayesian spam
filter when it proves necessary to do so. I was therefore able to
conduct the following experiment.

I scanned 4,703 spam / scam messages I received between April 13 and
July 13 (92 days) to find out which of them Thunderbird would flag as
a scam. It flagged 38 of them, a success rate of 0.8%. Not all of the
messages in question were ones that I would expect a scam detector to
flag, but most of them are, so I would certainly not consider a scam
detector that only flags 0.8% of my spam messages as scams as
particularly useful or accurate.

More significantly, I also scanned 12,860 legitimate email messages
which I received between February 13 and July 13 (151 days) to find
out which of them Thunderbird would flag falsely as a scam. It flagged
105 of them, a rate of 0.82%. The difference of only 0.02% between the
false positive rate and the true positive rate seems to imply that
it's pretty much a crap-shoot whether the scam detector is accurate
for any particular message.

Note that most of the known spam / scam messages which I fed through
the scam detector were detected and filtered out by my spam filter
when I first received them, i.e., I never actually saw them in my
inbox. If a user is using any halfway decent spam detector, then they
will have a similar experience. That means that they are likely to see
far more false positives than true positives from the scam detector.

In another message in this thread, Josiah proposed a UX experiment to
determine whether a lot of false positives from the scam detector
are influencing people to ignore its warnings. With all due respect to
Josiah, we don't need to do that research. The research has already
been done. It's well-established in the cybersecurity field (about
which I can speak with some conviction, seeing as how I've been
working in cybersecurity for nearly 30 years and I'm currently a CISO)
that alert fatigue causes alerts to be ignored. Furthermore, alert
fatigue bleeds between applications, which means that by generating a
lot of false positives in Thunderbird, we're making it more likely
that people will ignore real alerts from other applications. This is
not good for anyone.

Frankly, if Thunderbird's spam filter is halfway decent (and I don't
know whether it is, because as I said previously, I use bogofilter),
it is probably going to do a far better job of filtering out scams
than Thunderbird's current scam detector, and given the accuracy of
the detector as illustrated above, I'd advocate ditching it entirely
and sticking with just the spam filter.

So, I've now presented hard data illustrating just how bad the scam
detector's performance is. Are we still going to insist that we
shouldn't take any action because we only have anecdotal data to work
with?

  Jonathan Kamens

On 7/13/18 2:35 PM, Philipp Kewisch wrote:

Hi Folks,

I do get this on a few messages from people I know, but all in all,
if there is something that tells me a link does not have the same
target it seems to have, that alone would be enough for me to have it
enabled.

Both sides are purely anecdotal iiuc, so telemetry data would be
great. Unfortunately I believe it is not set up correctly, so
gathering data is a little more involved.

I am certain though we shouldn't just go turn it off because of a hunch.

Philipp

On 13. Jul 2018, at 6:59 PM, Jonathan Kamens <jik@kamens.us> wrote:

I agree with the many commenters there who have argued that
extremely inaccurate scam detection is in fact worse than no scam
detection at all, and if indeed it is as inaccurate as people there
are claiming it is, it should be turned off by default.

Personally, I use bogofilter on my mail server, which catches the
vast majority of spam and phishing emails before I ever see them
with a very low false-positive rate, so the first thing I do
whenever setting up a new Thunderbird profile is disable the spam
and scam detection engines in Thunderbird. Therefore, I don't have
any first-hand experience speaking to whether the claims of
inaccuracy in that ticket are accurate.

The thing about Wayne's "We should fix it rather than replace it"
argument (in comment 93
https://bugzilla.mozilla.org/show_bug.cgi?id=623198#c93) is that
we need to recognize the reality that the resources currently
available for substantive Thunderbird development are
extraordinarily limited, so unless this issue is bounced to the top
of the priority list, it's not going to get any developer work
directed at it for a long time. Given that, it really should be
turned off if it's so inaccurate that it gives users bad information
more often than good.

I suppose there is a third alternative, which may or may not have been
suggested in that ticket (frankly, I'm not going to read over 100
comments to find out). I'm under the impression that Thunderbird's
Telemetry support allows the app to capture information about what
users do and transmit it anonymously to the developers to give them
a better idea about how the app is being used and guide future
changes. A good start would be to capture Telemetry support about
how often TB marks a message as a potential scam, how often users
click the button telling TB that a message in fact is not a scam,
and how often users click the button telling TB that a message which
wasn't classified as such is a scam (the latter statistic would need
to be taken much less seriously than the others, though, since the
most likely outcome of a user recognizing a scam message as such is
to delete it). That Telemetry data might give us a better idea of
how accurate the scam detector is.

Regarding the question of how to build a better scam detector, as
comment 97 https://bugzilla.mozilla.org/show_bug.cgi?id=623198
points out, there is a lot of good research into how to build
effective scam detectors, so I would suggest the first step in any
effort to build a better scam detector into TB would be to consult
the research and the experts.

Or recognize that spam / scam detection is actually the job of the
MTA, not the MUA, and therefore we should get rid of all the spam
and scam detection functionality in TB and tell people who want it
to switch to a better mail service provider if theirs isn't doing a
good job of filtering.

  jik

On 7/13/18 11:59 AM, Ryan Sipes wrote:

Fellow Thunder Flock Members,

A long standing bug has been brought to my attention regarding false
positives in scam detection:
https://bugzilla.mozilla.org/show_bug.cgi?id=623198

I wanted to post this bug here and see if anyone on this list had ideas
on how to address this. Do you think we should keep scam detection
enabled? If so, are there any ways to improve scam detection so that
there are not so many false positives?

Thoughts and feedback appreciated.

JB
Josiah Bruner
Sat, Jul 14, 2018 6:09 PM

For the record my proposed experiment does not test for alert fatigue. I'm
well aware of that research and I'm sure almost everyone agrees with it.

My experiment tests for whether people with scam detection enabled fall for
phishing emails more often than those with it disabled. (this is the main
argument in the bug for disabling)

As far as I'm aware there is no such research available. If you have
counter examples though, please share.

Regards,
Josiah

On Sat, Jul 14, 2018, 1:21 PM Jonathan Kamens jik@kamens.us wrote:

Oh, I forgot one more thing that I wanted to point out. The other risk of
a scam detector with a low success rate and a high false positive rate is
that it gives people a false sense of security. They see the scam detector
triggering frequently enough that they know it's there and think that a
high false positive rate must also mean that it does a good job of
detecting real scams, and that makes them more likely to click on the
link when they receive a scam that the detector doesn't flag.

In short, the scam detector really is doing more harm than good in its
current form.

jik
On 7/14/18 1:17 PM, Jonathan Kamens wrote:

OK, so if you want to dismiss the experiences of people commenting on the
bug as anecdotal, then here's some hard data to consider.

I save all of my spam / scam messages and legitimate email messages going
back for several months so that I can retrain my Bayesian spam filter when
it proves necessary to do so. I was therefore able to conduct the following
experiment.

I scanned 4,703 spam / scam messages I received between April 13 and July
13 (92 days) to find out which of them Thunderbird would flag as a scam. It
flagged 38 of them, a success rate of 0.8%. Not all of the messages in
question were ones that I would expect a scam detector to flag, but most of
them are, so I would certainly not consider a scam detector that only flags
0.8% of my spam messages as scams as particularly useful or accurate.

More significantly, I also scanned 12,860 legitimate email messages which
I received between February 13 and July 13 (151 days) to find out which of
them Thunderbird would flag falsely as a scam. It flagged 105 of them, a
rate of 0.82%. The difference of only 0.02% between the false positive rate
and the true positive rate seems to imply that it's pretty much a
crap-shoot whether the scam detector is accurate for any particular message.

Note that most of the known spam / scam messages which I fed through the
scam detector were detected and filtered out by my spam filter when I first
received them, i.e., I never actually saw them in my inbox. If a user is
using any halfway decent spam detector, then they will have a similar
experience. That means that they are likely to see far more false positives
than true positives from the scam detector.
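
As a quick check of the arithmetic above, here is a minimal sketch that recomputes the two rates from the counts quoted in this message. The variable names are illustrative, and the per-day comparison is an added assumption: it treats the 92-day and 151-day samples as representative of daily mail volume.

```python
# Counts quoted in the message above; variable names are illustrative only.
spam_total, spam_flagged, spam_days = 4703, 38, 92
ham_total, ham_flagged, ham_days = 12860, 105, 151

# Share of known spam / scam messages that the detector flagged.
true_positive_rate = spam_flagged / spam_total
# Share of legitimate messages that the detector flagged falsely.
false_positive_rate = ham_flagged / ham_total

# The two samples cover different periods, so compare warnings per day.
true_alerts_per_day = spam_flagged / spam_days
false_alerts_per_day = ham_flagged / ham_days

print(f"true positive rate:  {true_positive_rate:.2%}")
print(f"false positive rate: {false_positive_rate:.2%}")
print(f"true alerts/day:     {true_alerts_per_day:.2f}")
print(f"false alerts/day:    {false_alerts_per_day:.2f}")
```

On these counts the detector raises roughly 0.41 correct warnings per day against about 0.70 false ones, and that is before a server-side spam filter removes most of the messages behind the correct warnings.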

In another message in this thread, Josiah proposed a UX experiment to
determine whether a lot of false positives from the scam detector are
influencing people to ignore its warnings. With all due respect to Josiah,
we don't need to do that research. The research has already been done. It's
well-established in the cybersecurity field (about which I can speak with
some conviction, seeing as how I've been working in cybersecurity for
nearly 30 years and I'm currently a CISO) that alert fatigue causes alerts
to be ignored. Furthermore, alert fatigue bleeds between applications,
which means that by generating a lot of false positives in Thunderbird,
we're making it more likely that people will ignore real alerts from other
applications. This is not good for anyone.

Frankly, if Thunderbird's spam filter is halfway decent (and I don't know
whether it is, because as I said previously, I use bogofilter), it is
probably going to do a far better job of filtering out scams than
Thunderbird's current scam detector, and given the accuracy of the detector
as illustrated above, I'd advocate ditching it entirely and sticking with
just the spam filter.

So, I've now presented hard data illustrating just how bad the scam
detector's performance is. Are we still going to insist that we shouldn't
take any action because we only have anecdotal data to work with?

Jonathan Kamens
On 7/13/18 2:35 PM, Philipp Kewisch wrote:

Hi Folks,

I do get this on a few messages from people I know, but all in all, if
there is something that tells me a link does not have the same target it
seems to have, that alone would be enough for me to have it enabled.

Both sides are purely anecdotal iiuc, so telemetry data would be great.
Unfortunately I believe it is not set up correctly, so gathering data is a
little more involved.

I am certain though we shouldn't just go turn it off because of a hunch.

Philipp

On 13. Jul 2018, at 6:59 PM, Jonathan Kamens jik@kamens.us wrote:

I agree with the many commenters there who have argued that extremely
inaccurate scam detection is in fact worse than no scam detection at all,
and if indeed it is as inaccurate as people there are claiming it is, it
should be turned off by default.

Personally, I use bogofilter on my mail server, which catches the vast
majority of spam and phishing emails before I ever see them with a very low
false-positive rate, so the first thing I do whenever setting up a new
Thunderbird profile is disable the spam and scam detection engines in
Thunderbird. Therefore, I don't have any first-hand experience speaking to
whether the claims of inaccuracy in that ticket are accurate.

The thing about Wayne's "We should fix it rather than replace it" argument
(in comment 93 https://bugzilla.mozilla.org/show_bug.cgi?id=623198#c93)
is that we need to recognize the reality that the resources currently
available for substantive Thunderbird development are extraordinarily
limited, so unless this issue is bounced to the top of the priority list,
it's not going to get any developer work directed at it for a long time.
Given that, it really should be turned off if it's so inaccurate that it
gives users bad information more often than good.

I suppose there is a third alternative, which may or may not have been
suggested in that ticket (frankly, I'm not going to read over 100 comments
to find out). I'm under the impression that Thunderbird's Telemetry support
allows the app to capture information about what users do and transmit it
anonymously to the developers to give them a better idea about how the app
is being used and guide future changes. A good start would be to capture
Telemetry data about how often TB marks a message as a potential scam,
how often users click the button telling TB that a message in fact is not a
scam, and how often users click the button telling TB that a message which
wasn't classified as such is a scam (the latter statistic would need to be
taken much less seriously than the others, though, since the most likely
outcome of a user recognizing a scam message as such is to delete it). That
Telemetry data might give us a better idea of how accurate the scam
detector is.
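
The three statistics proposed above could be tallied along these lines. This is only a sketch: the class, its field names, and the sample numbers are hypothetical and are not Thunderbird's actual Telemetry probes or real measurements.

```python
from dataclasses import dataclass


@dataclass
class ScamWarningCounters:
    """Hypothetical counters; not Thunderbird's actual Telemetry probes."""

    flagged: int = 0          # messages TB marked as a potential scam
    marked_not_scam: int = 0  # user told TB a flagged message is not a scam
    marked_scam: int = 0      # user told TB an unflagged message is a scam

    def dismissal_rate(self) -> float:
        """Share of warnings the user overrode: a rough false-positive proxy."""
        if self.flagged == 0:
            return 0.0
        return self.marked_not_scam / self.flagged


# Made-up numbers, purely to show how the ratio would be read.
counters = ScamWarningCounters(flagged=20, marked_not_scam=15, marked_scam=1)
print(f"warnings dismissed: {counters.dismissal_rate():.0%}")
```

The dismissal rate is at best a lower bound on the false positive share, since many users will ignore a wrong warning without clicking anything, and `marked_scam` deserves the caution already noted above.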

Regarding the question of how to build a better scam detector, as comment
97 https://bugzilla.mozilla.org/show_bug.cgi?id=623198 points out,
there is a lot of good research into how to build effective scam detectors,
so I would suggest the first step in any effort to build a better scam
detector into TB would be to consult the research and the experts.

Or recognize that spam / scam detection is actually the job of the MTA,
not the MUA, and therefore we should get rid of all the spam and scam
detection functionality in TB and tell people who want it to switch to a
better mail service provider if theirs isn't doing a good job of filtering.

jik
On 7/13/18 11:59 AM, Ryan Sipes wrote:

Fellow Thunder Flock Members,

A long standing bug has been brought to my attention regarding false
positives in scam detection:
https://bugzilla.mozilla.org/show_bug.cgi?id=623198

I wanted to post this bug here and see if anyone on this list had ideas
on how to address this. Do you think we should keep scam detection
enabled? If so, are there any ways to improve scam detection so that
there are not so many false positives?

Thoughts and feedback appreciated.


Maildev mailing list
Maildev@lists.thunderbird.net
http://lists.thunderbird.net/mailman/listinfo/maildev_lists.thunderbird.net



JK
Jonathan Kamens
Sat, Jul 14, 2018 6:16 PM

I've already explained why an alert with a high false positive and low
true positive rate makes people more likely to fall for the thing the
alert is trying to protect against. I explained it in the very message
to which you replied.

This is accepted wisdom in the security community. It also seems
patently obvious to me as someone who has worked in this field for 30 years.

It seems clear at this point that the folks making the decisions about
what to do about this issue are not going to listen to the data, the
feedback from users, or the experienced experts. I will therefore bow
out of this discussion now so as not to waste more of my time or yours.

Regards,

  jik

On 7/14/18 2:09 PM, Josiah Bruner wrote:

For the record my proposed experiment does not test for alert fatigue.
I'm well aware of that research and I'm sure almost everyone agrees
with it.

My experiment tests for whether people with scam detection enabled
fall for phishing emails more often than those with it disabled.
(this is the main argument in the bug for disabling)

As far as I'm aware there is no such research available. If you have
counter examples though, please share.

Regards,
Josiah

On Sat, Jul 14, 2018, 1:21 PM Jonathan Kamens <jik@kamens.us> wrote:

 Oh, I forgot one more thing that I wanted to point out. The other
 risk of a scam detector with a low success rate and a high false
 positive rate is that it gives people a false sense of security.
 They see the scam detector triggering frequently enough that they
 know it's there and think that a high false positive rate must
 also mean that it does a good job of detecting real scams, and
 that makes them /more/ likely to click on the link when they
 receive a scam that the detector doesn't flag.

 In short, the scam detector really is doing more harm than good in
 its current form.

   jik

 On 7/14/18 1:17 PM, Jonathan Kamens wrote:
 OK, so if you want to dismiss the experiences of people
 commenting on the bug as anecdotal, then here's some hard data to
 consider.

 I save all of my spam / scam messages and legitimate email
 messages going back for several months so that I can retrain my
 Bayesian spam filter when it proves necessary to do so. I was
 therefore able to conduct the following experiment.

 I scanned 4,703 spam / scam messages I received between April 13
 and July 13 (92 days) to find out which of them Thunderbird would
 flag as a scam. It flagged 38 of them, a success rate of 0.8%.
 Not all of the messages in question were ones that I would expect
 a scam detector to flag, but most of them are, so I would
 certainly not consider a scam detector that only flags 0.8% of my
 spam messages as scams as particularly useful or accurate.

 More significantly, I also scanned 12,860 legitimate email
 messages which I received between February 13 and July 13 (151
 days) to find out which of them Thunderbird would flag falsely as
 a scam. It flagged 105 of them, a rate of 0.82%. The difference
 of only 0.02% between the false positive rate and the true
 positive rate seems to imply that it's pretty much a crap-shoot
 whether the scam detector is accurate for any particular message.

 Note that most of the known spam / scam messages which I fed
 through the scam detector were detected and filtered out by my
 spam filter when I first received them, i.e., I never actually
 saw them in my inbox. If a user is using any halfway decent spam
 detector, then they will have a similar experience. That means
 that they are likely to see far more false positives than true
 positives from the scam detector.

 In another message in this thread, Josiah proposed a UX
 experiment to determine whether a lot of false positives from
 the scam detector are influencing people to ignore its warnings.
 With all due respect to Josiah, we don't need to do that
 research. The research has already been done. It's
 well-established in the cybersecurity field (about which I can
 speak with some conviction, seeing as how I've been working in
 cybersecurity for nearly 30 years and I'm currently a CISO) that
 alert fatigue causes alerts to be ignored. Furthermore, alert
 fatigue bleeds between applications, which means that by
 generating a lot of false positives in Thunderbird, we're making
 it more likely that people will ignore real alerts from other
 applications. This is not good for anyone.

 Frankly, if Thunderbird's spam filter is halfway decent (and I
 don't know whether it is, because as I said previously, I use
 bogofilter), it is probably going to do a far better job of
 filtering out scams than Thunderbird's current scam detector, and
 given the accuracy of the detector as illustrated above, I'd
 advocate ditching it entirely and sticking with just the spam filter.

 So, I've now presented hard data illustrating just how bad the
 scam detector's performance is. Are we still going to insist that
 we shouldn't take any action because we only have anecdotal data
 to work with?

   Jonathan Kamens

 On 7/13/18 2:35 PM, Philipp Kewisch wrote:
 Hi Folks,

 I do get this on a few messages from people I know, but all in
 all, if there is something that tells me a link does not have
 the same target it seems to have, that alone would be enough for
 me to have it enabled.

 Both sides are purely anecdotal iiuc, so telemetry data would be
 great. Unfortunately I believe it is not set up correctly, so
 gathering data is a little more involved.

 I am certain though we shouldn't just go turn it off because of
 a hunch.

 Philipp

 On 13. Jul 2018, at 6:59 PM, Jonathan Kamens <jik@kamens.us> wrote:
 I agree with the many commenters there who have argued that
 extremely inaccurate scam detection is in fact worse than no
 scam detection at all, and if indeed it is as inaccurate as
 people there are claiming it is, it should be turned off by
 default.

 Personally, I use bogofilter on my mail server, which catches
 the vast majority of spam and phishing emails before I ever see
 them with a very low false-positive rate, so the first thing I
 do whenever setting up a new Thunderbird profile is disable the
 spam and scam detection engines in Thunderbird. Therefore, I
 don't have any first-hand experience speaking to whether the
 claims of inaccuracy in that ticket are accurate.

 The thing about Wayne's "We should fix it rather than replace
 it" argument (in comment 93
 <https://bugzilla.mozilla.org/show_bug.cgi?id=623198#c93>) is
 that we need to recognize the reality that the resources
 currently available for substantive Thunderbird development are
 extraordinarily limited, so unless this issue is bounced to the
 top of the priority list, it's not going to get any developer
 work directed at it for a long time. Given that, it really
 should be turned off if it's so inaccurate that it gives users
 bad information more often than good.

 I suppose there is a third alternative, which may or may not have
 been suggested in that ticket (frankly, I'm not going to read
 over 100 comments to find out). I'm under the impression that
 Thunderbird's Telemetry support allows the app to capture
 information about what users do and transmit it anonymously to
 the developers to give them a better idea about how the app is
 being used and guide future changes. A good start would be to
 capture Telemetry data about how often TB marks a message as
 a potential scam, how often users click the button telling TB
 that a message in fact is not a scam, and how often users click
 the button telling TB that a message which wasn't classified as
 such is a scam (the latter statistic would need to be taken
 much less seriously than the others, though, since the most
 likely outcome of a user recognizing a scam message as such is
 to delete it). That Telemetry data might give us a better idea
 of how accurate the scam detector is.

 Regarding the question of how to build a better scam detector,
 as comment 97
 <https://bugzilla.mozilla.org/show_bug.cgi?id=623198> points
 out, there is a lot of good research into how to build
 effective scam detectors, so I would suggest the first step in
 any effort to build a better scam detector into TB would be to
 consult the research and the experts.

 Or recognize that spam / scam detection is actually the job of
 the MTA, not the MUA, and therefore we should get rid of all
 the spam and scam detection functionality in TB and tell people
 who want it to switch to a better mail service provider if
 theirs isn't doing a good job of filtering.

   jik

 On 7/13/18 11:59 AM, Ryan Sipes wrote:
 Fellow Thunder Flock Members,

 A long standing bug has been brought to my attention regarding false
 positives in scam detection:
 https://bugzilla.mozilla.org/show_bug.cgi?id=623198

 I wanted to post this bug here and see if anyone on this list had ideas
 on how to address this. Do you think we should keep scam detection
 enabled? If so, are there any ways to improve scam detection so that
 there are not so many false positives?

 Thoughts and feedback appreciated.
 _______________________________________________
 Maildev mailing list
 Maildev@lists.thunderbird.net
 <mailto:Maildev@lists.thunderbird.net>
 http://lists.thunderbird.net/mailman/listinfo/maildev_lists.thunderbird.net