Fake websites have become increasingly pervasive,
generating billions of dollars in fraudulent revenue at the expense
of unsuspecting Internet users. The design and appearance of these
websites make it difficult for users to manually identify them as
fake. Automated detection systems have emerged as a mechanism for
combating fake websites; however, most are fairly simplistic in terms
of their fraud cues and the detection methods employed.
Consequently, existing systems are susceptible to the many
obfuscation tactics used by fraudsters, resulting in poor
fake website detection performance. In light of these
deficiencies, we propose the development of a new class of fake
website detection systems that are based on statistical learning
theory (SLT). Using a design science approach, we developed a
prototype system to demonstrate the potential utility of this class of
systems. We conducted a series of experiments, comparing the
proposed system against several existing fake website detection
systems on a test bed encompassing 900 websites. The results
indicate that systems grounded in SLT can more accurately detect
various categories of fake websites by utilizing richer sets of
fraud cues in combination with problem-specific knowledge. Given the
hefty cost exacted by fake websites, the results have important
implications for e-commerce and online security.
Keywords: Fake website detection, Internet fraud, design science,
statistical learning theory, information systems development,
website classification