PRNG

June 19, 2019

Summary

Strong Points

Well-structured paper serving an underrepresented area (email tracking, as opposed to web tracking).

The authors bent over backwards to preserve privacy (they used only public mailboxes and commented on how long services keep emails).

Probably the first good large email dataset in the wild (2 million emails).

Defended the methodology at every step (e.g., email providers were picked based on search rankings).

Reported side observations (email providers keep emails for much longer than they advertise; emails contain personal information such as SSNs, EINs, etc.).

Weak Points

Reused the methodology of reference 17 for the tracking part.

Only domains supporting user-specified addresses are used.

The dataset is not representative of a normal user's inbox.

PII was not validated because of ethical concerns.

Improvements

More effort towards privacy-related results (not attempted because of ethical issues) would be good, even if the private data had to be obfuscated.