As web crawlers scour the Internet's web sites for content, they catalog pieces of potentially useful information. Search engines, such as Google, now provide advanced search functions that allow attackers to build a clearer picture of the network that they plan to attack later.
In particular, the following types of information are easily found:
Employee contact details and information
Direct-dial (DDI) telephone numbers
Physical addresses of the offices where employees are based
Details of internal email systems
DNS layout and naming convention, including domains and hostnames
Documents that reside on publicly accessible servers
Direct-dial telephone numbers are especially useful to determined attackers, who may later launch war dialing and other telephone-based attacks. It is very difficult for organizations and companies to prevent this information from being discovered; for example, it is made freely available every time a user posts to a mailing list with a signature block attached. To manage this risk more effectively, companies should run public record querying exercises of their own to ensure that the information an attacker can collect doesn't lead to a compromise.
Google's powerful advanced search function can be used to indirectly map networks and gather potentially useful information. The advanced search function itself is directly accessible at http://www.google.com/advanced_search?hl=en. In terms of functionality, searches can be refined in the following ways:
Include or exclude pages that contain specific words or phrases
Filter results using over 30 specific languages
Search for text strings within supported file types, such as:
Adobe PDF (.pdf)
Adobe PostScript (.ps)
Microsoft Word and Rich Text Format (.doc and .rtf)
Microsoft Excel (.xls)
Microsoft PowerPoint (.ppt)
Search for a text string in specific areas of a document:
Title of the document
Body text of the document
Links within the document
Search under specific domains
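The refinements listed above can also be expressed directly in the query string. The following is a minimal Python sketch, not a definitive implementation: the function names build_query and search_url are hypothetical, and it assumes the commonly documented site:, filetype:, and intitle: operators and the standard /search?q= URL form.

```python
from urllib.parse import urlencode


def build_query(terms=(), phrases=(), exclude=(), site=None,
                filetype=None, intitle=None):
    """Compose a search query string from a set of refinements.

    All parameter names here are illustrative, not part of any Google API.
    """
    parts = list(terms)                         # plain words to include
    parts += [f'"{p}"' for p in phrases]        # exact phrases
    parts += [f"-{w}" for w in exclude]         # words that must not appear
    if intitle:
        parts.append(f'intitle:"{intitle}"')    # match in the document title
    if filetype:
        parts.append(f"filetype:{filetype}")    # e.g. pdf, doc, xls, ppt
    if site:
        parts.append(f"site:{site}")            # restrict to one domain
    return " ".join(parts)


def search_url(query):
    """Return a URL that submits the query (assumes the /search?q= form)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    q = build_query(terms=["password"], exclude=["forum"],
                    site="example.com", filetype="xls")
    print(q)
    print(search_url(q))
```

Building the string in one place makes it easy to repeat the same search across each file type or domain of interest during a public record querying exercise.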
Google can be used to enumerate staff at the CIA, along with their email addresses and telephone and fax numbers. Figure 3-1 shows an example of this: the results of a Google search launched using the search string:
+"ucia.gov" +tel +fax
The possibilities are virtually endless with Google searches, depending on the exact type of data you are trying to mine. For example, if you simply want to enumerate all the web servers Google knows under the sony.com domain, you can submit a query string such as sony site:.sony.com.
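Once the result URLs from a site: query have been collected (by hand or otherwise), reducing them to a list of distinct web servers is straightforward. The sketch below assumes the URLs are already available as a list of strings; the function name hosts_under is hypothetical, and example.com stands in for the target domain.

```python
from urllib.parse import urlsplit


def hosts_under(domain, urls):
    """Return the sorted, de-duplicated hostnames in urls that fall under domain."""
    domain = domain.lower().lstrip(".")
    found = set()
    for url in urls:
        # hostname is None for strings without a scheme and network location
        host = (urlsplit(url).hostname or "").lower()
        # Accept the domain itself or any name ending in ".<domain>"
        if host == domain or host.endswith("." + domain):
            found.add(host)
    return sorted(found)


if __name__ == "__main__":
    results = [
        "http://www.example.com/a",
        "https://shop.example.com/b?x=1",
        "http://www.example.com/c",
        "http://notexample.com/",
    ]
    print(hosts_under("example.com", results))
```

Matching on the leading dot avoids counting look-alike names such as notexample.com as part of the target domain.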
An effective security-related application of a Google search is to list misconfigured web servers with directories that can be indexed and browsed freely. Figure 3-2 displays the search results of the following string:
allintitle: "index of /" site:.redhat.com
Often enough, web directories that provide file listings contain interesting files that aren't web-related (such as Word and Excel documents). An example of this is a large bank that stored its BroadVision rollout plans (including IP addresses and administrative usernames and passwords) in an indexed /cmc_upload/ directory. An automated scanner, such as N-Stealth, can't identify such a directory, but Google can find it by following links from elsewhere on the Internet.
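When auditing your own servers for this kind of exposure, a page that has already been retrieved can be checked for the telltale "Index of /" title and for links to document files. The following is a rough sketch under stated assumptions: it assumes an Apache-style listing whose title begins with "Index of /", the name inspect_listing and the INTERESTING extension list are hypothetical, and fetching the page is left out.

```python
from html.parser import HTMLParser

# Document types worth flagging (an illustrative list, not exhaustive)
INTERESTING = (".doc", ".rtf", ".xls", ".ppt", ".pdf")


class _ListingParser(HTMLParser):
    """Collect the page title and every link target."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def inspect_listing(html):
    """Return (is_listing, interesting_links) for one page of HTML."""
    parser = _ListingParser()
    parser.feed(html)
    is_listing = parser.title.strip().lower().startswith("index of /")
    hits = [h for h in parser.links if h.lower().endswith(INTERESTING)]
    # Only report files when the page really looks like a directory index
    return is_listing, (hits if is_listing else [])
```

A page that returns True together with a non-empty list of documents is a candidate for having directory indexing disabled or the files moved off the public server.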
Netcraft (http://www.netcraft.com) is another web querying site that actively scours Internet web sites. You can use it to map web farms and networks, as well as display the operating platform of each host and details of the web services running.
Internet newsgroup searches reveal much the same types of information as web searches. For example, using http://groups.google.com, you can issue a query of fedworld.gov, revealing usernames, machine names, accessible public servers, and other information as depicted in Figure 3-3.
After conducting web and newsgroup searches, you should have an initial understanding of the target networks in terms of domain names and offices. NIC and DNS querying are used next to probe further and identify Internet-based points of presence, along with details of hostnames and operating platforms used.
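Newsgroup postings expose usernames and machine names mostly through message headers and signatures. As an illustration, the sketch below pulls email addresses and hostnames under a given domain out of saved message text. The function name harvest is hypothetical, example.gov is a placeholder domain, and the regular expressions are deliberately simple rather than full RFC-compliant parsers.

```python
import re


def harvest(domain, text):
    """Collect email addresses and hostnames under domain that appear in text."""
    d = re.escape(domain.lower())
    # Stop at the end of the domain: not followed by more label characters
    tail = r"(?![a-z0-9-]|\.[a-z0-9])"
    email_re = re.compile(
        r"[a-z0-9._%+-]+@(?:[a-z0-9-]+\.)*" + d + tail, re.I)
    # Hostnames need at least one label in front of the domain
    host_re = re.compile(
        r"(?<![\w.-])(?:[a-z0-9-]+\.)+" + d + tail, re.I)
    emails = sorted({m.lower() for m in email_re.findall(text)})
    hosts = sorted({m.lower() for m in host_re.findall(text)})
    return emails, hosts


if __name__ == "__main__":
    post = (
        "From: JDoe@mail.example.gov\n"
        "Received: from gw1.example.gov (gw1.example.gov)\n"
        "See http://www.example.gov/docs for details\n"
    )
    emails, hosts = harvest("example.gov", post)
    print(emails)
    print(hosts)
```

Running the same function over your own organization's public postings gives a quick view of what usernames and internal machine names have already leaked.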