|
| 1 | +# Intro: What Is Google Hacking? |
| 2 | + |
| 3 | +- Leverage the power of Google for recon through advanced search operators |
| 4 | +- Pioneered by Johnny Long |
| 5 | + - [Google Hacking for Penetration Testers, Third Edition](https://www.amazon.com/Google-Hacking-Penetration-Testers-Third/dp/0128029641) |
| 6 | + - [Google Hacking Database](https://www.exploit-db.com/google-hacking-database/) |
| 7 | + - Search functionality needs improvement |
| 8 | + - Good for viewing categories |
| 9 | +- <https://github.com/JohnTroony/Google-dorks/blob/master/google-dorks.txt> |
| 10 | + - Good for overall search |
| 11 | + - After we learn a search operator, search here for other applications |
| 12 | + |
| 13 | + |
| 14 | +# Intro: General Info. |
| 15 | + |
| 16 | +- Level: All |
| 17 | +- Course [Notes](https://sts.show/google-hacking-1) and [Errata](https://sts.show/google-hacking-1-errata) |
| 18 | + - Focus on search operators in practical situations |
| 19 | + - Not every Google search operator will be covered |
| 20 | +- Examples |
| 21 | + - Phishing campaign simulation |
| 22 | + - Leveraging financial documents from Board of Directors' meetings |
| 23 | + - Searching nginx logs for error responses while adjusting for output variability |
| 24 | + - How to find admin area source code (or other functionality) |
| 25 | + - Search timestamp ranges within PHP error logs |
| 26 | + |
| 27 | + |
| 28 | +# Intro: But Why? |
| 29 | + |
| 30 | +- From a recon perspective, why is Google hacking advantageous? |
| 31 | + - Passive |
| 32 | + - Hard to trace |
| 33 | + - Google makes the connection to the site, not you |
| 34 | + |
| 35 | + |
| 36 | +# Intro: Additional Help |
| 37 | + |
| 38 | +- Search for `Google Dorking` |
| 39 | + - Synonymous with Google Hacking |
| 40 | +- Helpful websites |
| 41 | + - <http://www.googleguide.com/contents/> |
| 42 | + - <https://www.google.com/advanced_search> |
| 43 | + - Not as powerful |
| 44 | + - Can leverage if you need to get going ASAP |
| 45 | + |
| 46 | + |
| 47 | +# Sensitive Doc Dork: Background |
| 48 | + |
| 49 | +- Scenario |
| 50 | + - Phishing campaign will be aimed at |
| 51 | + - State employees who recently got a raise through the budgeting process |
| 52 | + - "confidential employees" |
| 53 | + - State employees who have access to privileged information |
| 54 | +- Phish email from "HR" |
| 55 | + - "Due to your recent salary adjustment, we need to confirm your banking information. Click <span class="underline">here</span> to confirm your bank account on file" |
| 56 | + - Link will redirect to a fake employee portal that steals login credentials |
| 57 | + - Social Engineering |
| 58 | + - Trust is implicitly built through disclosure of sensitive information |
| 59 | + - This information is commonly found via Google Dorks |
| 60 | + |
| 61 | + |
| 62 | +# Sensitive Doc Dork: Logical Operators |
| 63 | + |
| 64 | +- `filetype:(doc | pdf | xls | txt | rtf | odt | ppt) intext:(confidential salary | "salary schedule")` |
| 65 | + - `()` |
| 66 | + - Logical grouping |
| 67 | + - `OR` |
| 68 | + - Note the uppercase |
| 69 | + - AKA `|` |
| 70 | + - If there isn't an explicit `|`, an `AND` is implied |
| 71 | + - Within text |
| 72 | + - Googling `WPA2 KRACK Vulnerability` |
| 73 | + - Adjacent search operators |
| 74 | + |
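A minimal sketch of how the grouping and OR rules above compose into the slide's query. The helper names `group` and `phrase` are hypothetical and exist only for illustration; the output is a plain string to paste into the search box.

```python
def group(terms, joiner=" | "):
    """Wrap terms in parentheses; `|` (same as uppercase OR) joins alternatives."""
    return "(" + joiner.join(terms) + ")"


def phrase(text):
    """Quote a multi-word phrase so it is treated as an exact match."""
    return f'"{text}"'


filetypes = group(["doc", "pdf", "xls", "txt", "rtf", "odt", "ppt"])

# Words separated only by a space are implicitly ANDed,
# so "confidential salary" means confidential AND salary.
body = group(["confidential salary", phrase("salary schedule")])

# No space after the colon of an operator.
dork = f"filetype:{filetypes} intext:{body}"
print(dork)
```

Building the string from parts makes it easier to swap in other file types or keywords without unbalancing the parentheses.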
| 75 | + |
| 76 | +# Sensitive Doc Dork: `filetype` Operator (1/2) |
| 77 | + |
| 78 | +- `filetype:(doc | pdf | xls | txt | rtf | odt | ppt) intext:(confidential salary | "salary schedule")` |
| 79 | +- Do NOT put a space after the colon |
| 80 | + - Wrong: `filetype: (doc | pdf | xls | txt | rtf | odt | ppt)` |
| 81 | + - This holds for all operators |
| 82 | +- [Common file formats indexed by Google](https://support.google.com/webmasters/answer/35287?hl=en) |
| 83 | + - Can search for file extensions not on this list |
| 84 | + - Ex: `filetype:md` |
| 85 | + - Q: Why would this be useful? |
| 86 | + |
| 87 | + |
| 88 | +# Sensitive Doc Dork: `filetype` Operator (2/2) |
| 89 | + |
| 90 | +- `filetype:(doc | pdf | xls | txt | rtf | odt | ppt) intext:(confidential salary | "salary schedule")` |
| 91 | +- `filetype:pdf` |
| 92 | + - Will return |
| 93 | + - this-is-actually-a-pdf.txt |
| 94 | + - [Common file formats indexed by Google](https://support.google.com/webmasters/answer/35287?hl=en) |
| 95 | + - Extensions on this list are guaranteed to have this special property |
| 96 | + - this-isn't-a-pdf.pdf |
| 97 | + - this-is-a-pdf.pdf |
| 98 | + |
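The three results listed above suggest that `filetype:pdf` matches on either the file extension or the actual content. The sketch below only models that behavior as the slide describes it; it is not Google's implementation. The function name `matches_filetype_pdf` and the sample byte strings are made up for illustration. The one firm fact it relies on is that PDF files begin with the `%PDF-` header.

```python
from pathlib import PurePosixPath

PDF_MAGIC = b"%PDF-"  # real PDF files start with this header


def matches_filetype_pdf(name: str, data: bytes) -> bool:
    """Model of the slide's claim: match on extension OR on content."""
    has_pdf_extension = PurePosixPath(name).suffix.lower() == ".pdf"
    has_pdf_content = data.startswith(PDF_MAGIC)
    return has_pdf_extension or has_pdf_content


# Hypothetical files mirroring the names on the slide.
samples = {
    "this-is-actually-a-pdf.txt": b"%PDF-1.4\n...",
    "this-isn't-a-pdf.pdf": b"just plain text",
    "this-is-a-pdf.pdf": b"%PDF-1.7\n...",
    "notes.txt": b"just plain text",
}

for name, data in samples.items():
    print(name, matches_filetype_pdf(name, data))
```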
| 99 | + |
| 100 | +# Sensitive Doc Dork: `intext` Operator |
| 101 | + |
| 102 | +- `filetype:(doc | pdf | xls | txt | rtf | odt | ppt) intext:(confidential salary | "salary schedule")` |
| 103 | +- Helpful for constraining a search to a document's body |
| 104 | + - Regular Google search can match page titles, items in the url path, etc. |
| 105 | +- `intext:(confidential salary | "salary schedule")` |
| 106 | + - Has `confidential` AND `salary` somewhere in the text body |
| 107 | + - `"salary schedule"` |
| 108 | + - Exact match |
| 109 | + - We leave this search relatively vague to capture other results of interest |
| 110 | + - We don't do `intext:("confidential employee" | "salary schedule")` |
| 111 | +- Problems |
| 112 | + - Look at query |
| 113 | + - Too many false positives |
| 114 | + |
| 115 | + |
| 116 | +# Sensitive Doc Dork: `inurl` Operator |
| 117 | + |
| 118 | +- `filetype:(doc | pdf | xls | txt | rtf | odt | ppt) intext:(confidential |
| 119 | + salary | "salary schedule") inurl:(confidential "board approved */*/17")` |
| 120 | +- `confidential` |
| 121 | + - Must be in the url |
| 122 | +- `"board approved */*/17"` |
| 123 | + - Can be in the url or anywhere within the document |
| 124 | + - `*` is expanded to one or more words |
| 125 | + - Cross-check via bold in the results output |
| 126 | + - Search Ex. |
| 127 | +- Doesn't work well for non-words |
| 128 | + - Ex: `inurl:"sid=*"` |
| 129 | + - `sid` is for a PHP session |
| 130 | + - Through a proxy log dork, we can find sensitive urls/url parameters |
| 131 | + |
| 132 | + |
| 133 | +# Proxy Log Dork: Why Search Through Proxy Logs? |
| 134 | + |
| 135 | +1. URL Data leakage |
| 136 | + - Common for all GET requests/responses to be logged at the proxy layer |
| 137 | + - Best practice is to place sensitive information within the POST body |
| 138 | + - Ex: session id, sensitive tokens, SSNs, etc. |
| 139 | + - In practice, it is often placed in the URL of a GET request instead |
| 140 | + - Ex: <https://example.com/256993ac-ba65-11e7-8e6d-0242ac110003/profile> |
| 141 | + - Ex: <https://example.com/profile?sid=256993ac-ba65-11e7-8e6d-0242ac110003> |
| 142 | +2. Abnormal response codes |
| 143 | + - <https://en.wikipedia.org/wiki/List_of_HTTP_status_codes> |
| 144 | + - Example on the next slide |
| 145 | + |
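To make the URL data leakage point concrete, here is a rough sketch that pulls session-like values out of logged URLs, covering both example shapes above (identifier in the path and identifier in the `sid` parameter). The watch list `SENSITIVE_PARAMS` and the function name `leaked_values` are assumptions for illustration, not part of any tool.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Hypothetical watch list of parameter names worth flagging.
SENSITIVE_PARAMS = {"sid", "token", "ssn"}

# Matches the 8-4-4-4-12 hex layout of a UUID.
UUID_RE = re.compile(r"^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$", re.I)


def leaked_values(url: str) -> list:
    """Return values in a URL that look like leaked identifiers."""
    parts = urlsplit(url)
    found = []
    # Sensitive values passed as query parameters.
    for key, values in parse_qs(parts.query).items():
        if key.lower() in SENSITIVE_PARAMS:
            found.extend(values)
    # UUID-like identifiers embedded in the path.
    found.extend(seg for seg in parts.path.split("/") if UUID_RE.match(seg))
    return found


for url in (
    "https://example.com/256993ac-ba65-11e7-8e6d-0242ac110003/profile",
    "https://example.com/profile?sid=256993ac-ba65-11e7-8e6d-0242ac110003",
):
    print(url, "->", leaked_values(url))
```

Either URL shape ends up in a proxy log verbatim, which is why both are worth searching for.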
| 146 | + |
| 147 | +# Proxy Log Dork: `AROUND(X)` Operator |
| 148 | + |
| 149 | +- `TERM 1` `AROUND(2)` `TERM 2` |
| 150 | + - `TERM 1` is within 2 words of `TERM 2` |
| 151 | + - Capitalize `AROUND` for more consistent results |
| 152 | +- Useful for searching documents where the ordering of terms can be customized |
| 153 | + - Ex: Logs |
| 154 | +- Nginx Log Ex: |
| 155 | + - `- - - - "GET / HTTP/1.1" STATUS_CODE - - -` |
| 156 | + - Other fields are replaced with `-` to simplify |
| 157 | +- `filetype:log inurl:(access.log | error.log) intext:("HTTP/" AROUND(5) 500) -site:github.com` |
| 158 | + |
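One way to reason about `AROUND(X)` is as a word-distance check. The sketch below is only an approximation under stated assumptions: it splits on whitespace and uses substring matching, whereas Google's actual tokenization is not documented here. The function name `within_n_words` and the sample log lines are made up for illustration.

```python
def within_n_words(text: str, term_a: str, term_b: str, n: int) -> bool:
    """True if some token containing term_a is at most n tokens away
    from some token containing term_b (whitespace tokenization)."""
    tokens = text.split()
    pos_a = [i for i, tok in enumerate(tokens) if term_a in tok]
    pos_b = [i for i, tok in enumerate(tokens) if term_b in tok]
    return any(i != j and abs(i - j) <= n for i in pos_a for j in pos_b)


# Hypothetical access-log lines; field order may differ per configuration.
line_hit = '203.0.113.7 - - [10/Oct/2017:13:55:36 +0000] "GET / HTTP/1.1" 500 612'
line_miss = '203.0.113.7 - - [10/Oct/2017:13:55:36 +0000] "GET / HTTP/1.1" 200 612'

print(within_n_words(line_hit, "HTTP/", "500", 5))
print(within_n_words(line_miss, "HTTP/", "500", 5))
```

Because the check is distance-based rather than position-based, it still matches when a customized log format places a few extra fields between the request and the status code.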
| 159 | + |
| 160 | +# Proxy Log Dork: `site` Operator |
| 161 | + |
| 162 | +- Scopes a search to a particular domain |
| 163 | + - Can even be a TLD |
| 164 | + - `site:.net` |
| 165 | +- `site:github.com` |
| 166 | + - Will match `github.com` and `*.github.com` |
| 167 | + - Leave out `www` to ensure search of all subdomains |
| 168 | +- `filetype:log inurl:(access.log | error.log) intext:("HTTP/" AROUND(5) 500) -site:github.com` |
| 169 | + - `-site:github.com` |
| 170 | + - Helps us remove example logs that are false positives |
| 171 | + - Or are they? |
| 172 | + - For a targeted search, don't discard them |
| 173 | + - Stack Overflow for recon |
| 174 | + - Review Ex. |
| 175 | + |
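The matching rule described above (`github.com` plus `*.github.com`, and TLD scopes such as `.net`) can be sketched as a hostname suffix check. This models the behavior as the slide states it and is not Google's implementation; `in_site_scope` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def in_site_scope(url: str, site: str) -> bool:
    """True if the URL's host equals `site` or is a subdomain of it."""
    host = (urlsplit(url).hostname or "").lower()
    site = site.lower().lstrip(".")  # allow TLD scopes such as ".net"
    return host == site or host.endswith("." + site)


print(in_site_scope("https://gist.github.com/some/path", "github.com"))
print(in_site_scope("https://notgithub.com/", "github.com"))
print(in_site_scope("https://example.net/", ".net"))
# Including `www` narrows the scope and drops sibling subdomains.
print(in_site_scope("https://gist.github.com/some/path", "www.github.com"))
```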
| 176 | + |
| 177 | +# `-` Operator |
| 178 | + |
| 179 | +- `-site:github.com -next -last -reply -"I want to leave this out"` |
| 180 | +- Make sure search results don't contain a given… |
| 181 | + - operator |
| 182 | + - word |
| 183 | + - `-next` helps exclude help-forum results |
| 184 | + - phrase |
| 185 | + |
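A small sketch of generating the exclusion string above from a list of operators, words, and phrases. The helper `exclude` is hypothetical, and its rule for spotting operators (presence of a colon) is a crude assumption that is good enough for the examples on this slide.

```python
def exclude(*terms: str) -> str:
    """Prefix each term with `-`, quoting multi-word phrases."""
    parts = []
    for term in terms:
        is_operator = ":" in term  # crude heuristic, e.g. site:github.com
        needs_quotes = " " in term and not is_operator
        parts.append("-" + (f'"{term}"' if needs_quotes else term))
    return " ".join(parts)


negations = exclude(
    "site:github.com", "next", "last", "reply", "I want to leave this out"
)
print(negations)
```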
| 186 | + |
| 187 | +# Error Log Dork: `X..Y` (Range) Operator |
| 188 | + |
| 189 | +- Finds a given number range |
| 190 | +- `warning error on line php filetype:log 2015..2017` |
| 191 | + - Search PHP error logs for timestamps in a given year range |
| 192 | + - Ex. |
| 193 | + |
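Once a log turns up, the same year range can be applied locally to narrow it down. The sketch assumes PHP's default error-log timestamp layout, `[DD-Mon-YYYY HH:MM:SS TZ]`; logs written with a custom handler may differ. The sample lines and the helper name `lines_in_year_range` are made up for illustration.

```python
import re

# Assumed timestamp prefix, e.g. "[17-Aug-2016 22:40:10 UTC]".
PHP_TS_RE = re.compile(r"^\[\d{2}-[A-Za-z]{3}-(\d{4}) ")


def lines_in_year_range(lines, first_year, last_year):
    """Keep log lines whose timestamp year is within [first_year, last_year]."""
    hits = []
    for line in lines:
        match = PHP_TS_RE.match(line)
        if match and first_year <= int(match.group(1)) <= last_year:
            hits.append(line)
    return hits


# Hypothetical log lines.
sample = [
    "[03-Feb-2014 09:12:01 UTC] PHP Warning:  include(): Failed opening 'a.php' in /var/www/index.php on line 3",
    "[17-Aug-2016 22:40:10 UTC] PHP Warning:  Division by zero in /var/www/calc.php on line 12",
    "[05-Jan-2018 07:00:45 UTC] PHP Notice:  Undefined index: id in /var/www/view.php on line 8",
]

for line in lines_in_year_range(sample, 2015, 2017):
    print(line)
```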
| 194 | + |
| 195 | +# Admin Functionality Dork: `inanchor` Operator |
| 196 | + |
| 197 | +- Finds text within a link/anchor |
| 198 | +- `inanchor:admin site:hackthissite.org` |
| 199 | + - Great way to find admin portals |
| 200 | +- How can we remove some of the clutter from the results? |
| 201 | +- Ex. |
| 202 | + |
| 203 | + |
| 204 | +# Admin Functionality Dork: `intitle` Operator |
| 205 | + |
| 206 | +- Searches through page titles |
| 207 | +- `inanchor:admin site:hackthissite.org -intitle:"view topic"` |
| 208 | +- Why did the source code come up in the results? |
| 209 | +- Ex. |
| 210 | + |
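Dorks contain colons, quotes, and spaces, so they need URL encoding before they can be shared as a link. A minimal sketch using the standard library follows; `search_url` is a hypothetical helper, and it only builds the address of an ordinary Google results page for opening in a browser.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def search_url(query: str) -> str:
    """Build a Google search URL with the query safely encoded."""
    return "https://www.google.com/search?" + urlencode({"q": query})


dork = 'inanchor:admin site:hackthissite.org -intitle:"view topic"'
url = search_url(dork)
print(url)

# Decoding the URL recovers the original dork unchanged.
print(parse_qs(urlsplit(url).query)["q"][0])
```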
| 211 | + |
| 212 | +# Further Learning |
| 213 | + |
| 214 | +- Overall Google functionality |
| 215 | + - <http://www.googleguide.com/contents/> |
| 216 | +- Operators |
| 217 | + - <https://support.google.com/websearch/answer/2466433?hl=en> |
| 218 | + - `cache:` |
| 219 | +- Tools |
| 220 | + - <https://github.com/Ekultek/Zeus-Scanner> |
| 221 | +- Dork lists |
| 222 | + - [Google Hacking Database](https://www.exploit-db.com/google-hacking-database/) |
| 223 | + - <https://github.com/JohnTroony/Google-dorks/blob/master/google-dorks.txt> |