DorkPot: A Honeypot-based Analysis of Google Dorks

Florian Quinkert, Eduard Leonhardt, Thorsten Holz

Workshop on Measurements, Attacks, and Defenses for the Web (MADWeb), San Diego, California, USA, February 2019 - ** Best Paper Award **


Attackers use search engines to find vulnerable systems and sensitive information on the Internet, such as passwords or hidden files. Besides common search terms, they use advanced search parameters called Google dorks to restrict results to, for example, pages with specific strings in the URL or files with a particular extension. So far, only a few works have empirically studied Google dorks, e.g., whether they are still in use, which Google dorks attackers use and how often, and how old those Google dorks are.

In this paper, we study this type of attack from a different perspective and present DorkPot, a dynamic, low-interaction webserver honeypot that detects requests related to Google dorks and thereby lets us analyze such attacks in the wild. DorkPot takes Google dorks as input and creates one website per Google dork whose content matches that dork, e.g., the strings it requires in the title field or the URL. This ensures that the website can later be found via a Google search with the corresponding Google dork. To evaluate our prototype implementation, we deployed DorkPot with more than 4,000 Google dorks as input on ten instances hosted by a cloud provider. Over more than ten months, we collected almost 9,000 clicks for 371 different Google dorks. Our analysis reveals that the top-ten Google dorks were responsible for more than 50% of the clicks, were mostly published in mid-2017, and searched for various online devices, such as IP cameras or routers, as well as passwords and database backups. In particular, three of the top-ten Google dorks targeted Internet of Things devices, and another two searched for passwords and related files containing authentication information.
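
To make the idea of creating a website that matches a Google dork more concrete, the following is a minimal sketch in Python. It is not DorkPot's actual implementation: the function names `parse_dork` and `build_page`, the set of operators handled (`intitle`, `intext`, `inurl`, `filetype`, `ext`), and the sample dorks are assumptions chosen purely for illustration.

```python
import html
import re

# Matches operator:value pairs; the value is either quoted or a single token.
DORK_TERM = re.compile(r'(\w+):(?:"([^"]*)"|(\S+))')


def parse_dork(dork: str) -> dict[str, list[str]]:
    """Split a dork into a mapping from operator name to its values."""
    terms: dict[str, list[str]] = {}
    for op, quoted, bare in DORK_TERM.findall(dork):
        terms.setdefault(op.lower(), []).append(quoted or bare)
    return terms


def build_page(dork: str) -> tuple[str, str]:
    """Return a (URL path, HTML document) pair that satisfies the dork.

    Hypothetical helper: a real honeypot would also have to get the page
    indexed and log incoming requests, which is out of scope here.
    """
    terms = parse_dork(dork)
    title = " ".join(terms.get("intitle", [])) or "Index"
    body = " ".join(terms.get("intext", []))

    # Strings demanded by inurl: become segments of the URL path.
    path = "/" + "/".join(s.strip("/") for s in terms.get("inurl", []))

    # filetype:/ext: require a file with that extension.
    ext = terms.get("filetype") or terms.get("ext")
    if ext and not path.endswith("." + ext[0]):
        path = path.rstrip("/") + "/index." + ext[0]

    page = (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{html.escape(title)}</title></head>"
        f"<body><p>{html.escape(body)}</p></body></html>"
    )
    return path, page
```

For example, `build_page('intitle:"Live View" inurl:view/index.shtml')` yields the path `/view/index.shtml` and a page whose title is `Live View`, so a search with that dork could, once the page is indexed, lead to the honeypot.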