To understand how to defend against Google hacking, we first need to look at what “Google Hacking” means.
Google Hacking, also called Google Dorking, is a technique in which Google Search or other Google applications are used to find vulnerabilities in the configuration or source code of a website.
So how can the Google search engine be used to find such vulnerabilities?
The answer is that Google supports a multitude of advanced operators, which make the ordinary searches we run every day far more precise.
Here are some examples of Google's advanced operators:
- Link: Sites that have a hyperlink to the URL specified will be returned in the search results.
- Cache: The cache operator returns the copy of the specified page that is stored in Google's cache, rather than the live page.
- Related: The related operator will return results that are “similar” to the page that was specified.
- Intitle: Returns only pages whose title contains the word or phrase entered after the operator.
We can refine a search by combining these operators with the terms of a standard Google query, as in the examples below. Note that the operator and its value are written without a space after the colon:
intext:classified company_name - used to find sensitive or confidential information about a company.
intitle:index.of.config - used to find web server configuration details that are not intended to be viewed by the public.
Hence sensitive information such as confidential data, configuration details, customer usernames and passwords, directory listings and much more can be extracted by attackers from simple search results.
Now that we have a brief idea of what Google Hacking is, we shall focus on how to keep our website or organization from becoming a victim of Google hackers.
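The operator combinations above can also be composed programmatically, for instance when we want to keep a list of queries to run against our own site during a review. The following is a minimal sketch, not a tool: the helper name `dork` is hypothetical, and it only builds the query string without contacting Google.

```python
def dork(operator, value, *extra_terms):
    """Build a query string such as 'intext:classified company_name'.

    The operator and its value are joined with a colon and no space;
    any extra terms are appended as ordinary search words.
    """
    parts = [f"{operator}:{value}"]
    parts.extend(extra_terms)
    return " ".join(parts)


# The two example queries from the text
print(dork("intitle", "index.of.config"))           # intitle:index.of.config
print(dork("intext", "classified", "company_name"))  # intext:classified company_name
```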
Defense against Google Hacking:
Having seen the effects of Google Hacking in the examples above, we will now see how to safeguard URLs that hold sensitive information by making sure they are not crawled or displayed in search results.
1. Protecting server directories with a password:
When we have confidential or private content that we don't want to appear in Google search results, the simplest and most effective way to keep those URLs out of the results is to store the content in a password-protected directory on the site's server. Googlebot and all other web crawlers cannot access content stored in password-protected directories.
For example, if we are using the Apache web server, we can edit the .htaccess file to password-protect a directory on the server.
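As an illustration of the Apache case, the sketch below generates the contents of such a .htaccess file using HTTP Basic authentication. The function name `render_htaccess`, the realm text and the password file path are assumptions made for the example. It also assumes that the password file has already been created separately (for example with Apache's htpasswd utility) and that the server configuration allows .htaccess files to set authentication directives.

```python
def render_htaccess(realm, htpasswd_path):
    """Return .htaccess text that requires a valid login for the directory.

    realm         -- text shown in the browser's login prompt (assumed value)
    htpasswd_path -- location of the password file (assumed path; keep it
                     outside the web root so it cannot be downloaded)
    """
    lines = [
        "AuthType Basic",
        f'AuthName "{realm}"',
        f"AuthUserFile {htpasswd_path}",
        "Require valid-user",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values for illustration only
print(render_htaccess("Restricted Area", "/etc/apache2/.htpasswd"))
```

The generated text would be saved as .htaccess inside the directory we want to protect. A crawler requesting anything in that directory without credentials then receives an authentication challenge instead of the content.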
2. Using meta tags to block search indexing:
Another way to prevent a page from appearing in Google Search results is to include a noindex meta tag in the HTML code of the page. The next time Googlebot crawls that page, it will find the noindex meta tag and drop the page entirely from Google Search results, regardless of whether other sites link to it.
For this to work, the page must not also be blocked by a robots.txt file. If it is, the crawler will never see the noindex tag, and the page can still appear in search results if other pages link to it.
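To check whether a page really carries the tag, we can parse its HTML and look for a robots meta tag whose content includes noindex. The sketch below uses only the Python standard library. The class and function names are hypothetical, the sample HTML is made up for the example, and treating a meta tag named "googlebot" the same as "robots" is a simplification.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for directive in attr.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html):
    """Return True if the HTML asks crawlers not to index the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Made-up page used only for this example
page = '<html><head><meta name="robots" content="noindex"></head><body></body></html>'
print(has_noindex(page))  # True
```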
3. Using a robots.txt file to block URLs:
A robots.txt file is a file at the root of the site that indicates which parts of the site we don't want search engine crawlers to access. The file follows the Robots Exclusion Standard, a protocol with a small set of commands for controlling access to the site by section and by specific kinds of web crawlers (such as mobile crawlers vs. desktop crawlers).
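To see how a crawler that honors the standard would interpret such a file, we can feed a sample robots.txt to Python's built-in urllib.robotparser module. This is a sketch under stated assumptions: the rules, the directory names and the example.com URLs are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every crawler is asked to stay out of two directories
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /config/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A compliant crawler should skip the disallowed paths and may fetch the rest
print(parser.can_fetch("Googlebot", "https://example.com/private/report.html"))  # False
print(parser.can_fetch("Googlebot", "https://example.com/index.html"))           # True
```

Keep in mind that robots.txt is only a request to well-behaved crawlers and is itself publicly readable, so it should not be the only protection for truly sensitive directories.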
Note: As a temporary solution, sensitive content can be removed from Google's index quickly by sending a request to Google's automatic URL removal system, which requires registering with a Google account.
Even with all these preventive measures, we can't say that our site is completely safe from Google hackers. The best way to know the security posture of a website is to hire security professionals to simulate hacking activity through penetration testing. Vulnerability scanning tools can cover some of the same ground, but only penetration testing will confirm how vulnerable a website really is to Google Hacking as well as other attacks.