OSINT - Image Analysis or More Where, When, and Metadata [Guest Diary]
[This is a Guest Diary by Thomas Spangler, an ISC intern as part of the SANS.edu BACS program [11]]
A picture is worth a thousand words, as the saying goes. Open-source information combined with basic image analysis can be a valuable tool for investigators. The purpose of this blog is to demonstrate the power of image analysis and the associated tools for open-source intelligence (OSINT). Having recently completed SANS SEC497, I was inspired to share how image analysis can provide valuable information for investigations. This post will provide a step-by-step approach using a random image [1] pulled from the internet.
SAFETY FIRST
Always scan a file or URL prior to retrieving a target image. This action is particularly useful when retrieving information from suspicious or unknown websites. A tool like VirusTotal [2] makes this step very easy.
First, select your scan type: File, URL, or Search. In the case of a file, it can be dragged and dropped on the screen.
In this case, I used a known PDF file to generate the sample result shown below.
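For a file that is already sitting in an isolated analysis environment, the Search option can also be used with a file hash instead of uploading the file itself. The snippet below is a minimal sketch of computing that hash; the function name `sha256_of` and the chunk size are illustrative choices, not part of any VirusTotal tooling.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file)
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `sha256_of("sample.pdf")` (a hypothetical file name) returns a 64-character hex string that can be pasted into the Search field.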
Now we are clear to proceed with the image analysis…
TARGET IMAGE
Our target image was randomly selected from the NY Times website.
Credit: Filip Singer, EPA
WHERE WAS THIS IMAGE TAKEN
A natural first question might be: where was this image taken? OSINT analysts use many tools, including image analysis, to answer questions like this one. As you will see, image analysis alone cannot answer this question. Other tools like Google searches, translation tools, and metadata can be combined with image analysis to provide discrete clues that together point to an answer.
Potentially identifiable or unique markings…
In looking for image clues, focus on context (e.g. bridge collapse and flooding), unique markers (e.g. signs, buildings, bridges), and geography.
With these clues in hand, we can now use tools like Google Lens [3] and Yandex [4] (if your organization or agency permits its use because of the Russian origin) for reverse image lookups and text-based searches. While most people think of Google searches in terms of text, Google Lens is the image search equivalent, which can be used to find additional clues. In this case, I used Google Lens with the original image and the image clues mentioned above to find relevant matches. Below are the Google Lens matches obtained from a search on the original image:
From the Google Lens results, the images from www.lusa.pt and TAG24 seem to be similar matches. Note the TAG24 description indicated Dresden and is written in German. Upon visiting the TAG24 website [5], we find a different image of the same location and an article in German.
Using another important OSINT tool, Google Translate, we can translate some of the text to English in order to find the exact bridge and location in question.
Voila…Carola Bridge. A simple Google text search on Carola Bridge turns up an article from Euronews [6] that confirms the image location at the Carola Bridge in Dresden, Germany. We can also use a Google Dork…maps:carola bridge…to find a map of the location:
WHEN
From the Euronews article, we also know that the bridge collapsed in the middle of the night of 11-12 September 2024.
An AP article [7] that also turned up in the previous Google search indicated that “crews were alerted around 3am”. An Engineering News Record article [8] confirms the collapse occurred on 11 September 2024, and a Deutsche Welle article [9] confirms that demolition of the fallen structure began on 13 September 2024.
We can conclude that this picture was taken sometime between 3am local time on 11 Sept 2024 and daylight hours on 13 Sept 2024. With further investigation, using Google Street View and similar tools, we could have probably narrowed the timeline down even further.
METADATA
I wanted to touch on one other important topic…metadata. Metadata (as shown in the details below from the reference image) presents interesting information such as location, size, imaging device, date, and time for the image in question. Original images, videos, and files usually contain a treasure chest of information in the form of metadata. Using Exiftool [10], the following data is returned on the target file in this blog:
It includes some basic information about the image size, encoding process, etc., but with original images, location, camera type, date, and time will all likely be included. These pieces of metadata could drastically speed up any OSINT investigation.
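Before reaching for a full metadata tool, it can be useful to check quickly whether a JPEG still carries an Exif block at all, since many websites strip it on upload. The following is a rough standard-library sketch, not a replacement for ExifTool: it only walks the JPEG header segments and reports which metadata containers are present. The function names `jpeg_segments` and `metadata_summary` are made up for this illustration, and the parser assumes a well-formed file with no padding bytes between segments.

```python
import struct


def jpeg_segments(data):
    """Yield (marker, payload) for each header segment that precedes the image data."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):  # start of scan or end of image: headers are done
            break
        # The 2-byte big-endian length counts itself but not the marker bytes
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        yield marker, data[pos + 4:pos + 2 + length]
        pos += 2 + length


def metadata_summary(data):
    """Report which common metadata containers appear in the JPEG headers."""
    summary = {"jfif": False, "exif": False, "comment": None}
    for marker, payload in jpeg_segments(data):
        if marker == 0xE0 and payload.startswith(b"JFIF\x00"):
            summary["jfif"] = True
        elif marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            summary["exif"] = True
        elif marker == 0xFE:
            summary["comment"] = payload.decode("latin-1")
    return summary
```

If `metadata_summary(open(path, "rb").read())` reports `"exif": False`, location, camera, and timestamp fields have most likely been removed, and the other techniques in this post become more important.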
CONCLUSION
In conclusion, imagery can be an important starting point for OSINT investigations. However, more cyber tools than just image analysis must be employed to answer some basic questions like who, where, and when. In certain cases, an analyst needs to pay close attention to their own attribution (“being found”) when conducting an investigation. Instead of using live web searches from a local machine, an analyst may need to use sock puppet accounts, VPN protection, and/or cloud-based hosts and even tools like Google Cache and the Wayback Machine for archived web sites to protect their identities and the fact that a target is being investigated.
Thank you to SEC497 instructor Matt Edmondson for piquing my interest in OSINT and for the skills developed during the course.
[1] nytimes.com
[2] virustotal.com
[3] https://chromewebstore.google.com/detail/download-google-lens-for/miijkofiplfeonkfmdlolnojlobmpman?hl=en
[4] Yandex.com/images
[5] https://www.tag24.de/thema/naturkatastrophen/hochwasser/hochwasser-dresden/hochwasser-in-dresden-pegel-prognosen-werden-sich-bestaetigen-3317729#google_vignette
[6] https://www.euronews.com/my-europe/2024/09/12/major-bridge-partially-collapses-into-river-in-dresden
[7] https://apnews.com/article/dresden-germany-bridge-collapse-carola-bridge-ad1ebf71f396d8984d2e79f9e6ba3f06
[8] https://www.enr.com/articles/59283-dramatic-bridge-failure-surprises-dresden-germany-officials
[9] https://www.dw.com/en/dresden-rushes-to-remove-collapsed-bridge-amid-flood-warning/a-70215802
[10] https://exiftool.org/
[11] https://www.sans.edu/cyber-security-programs/bachelors-degree/
-----------
Guy Bruneau IPSS Inc.
My Handler Page
Twitter: GuyBruneau
gbruneau at isc dot sans dot edu
DNS Reflection Update and Odd Corrupted DNS Requests
Occasionally, I check in on what reflective DNS denial of service attacks are doing. We usually see steady levels of attacks. Usually, they attempt to use spoofed requests for ANY records to achieve the highest possible amplification. Currently, I am seeing these records used (among others):
ANY nlrb.gov
The response for this query may be up to 5,826 bytes in size. With a query payload size of 37 bytes, this leads to a rather impressive amplification. The original name server appears to do the right thing, and it ignores EDNS0, but that, of course, doesn't help with open resolvers.
ANY ncca.mil
This domain is a bit odd. I only receive empty responses for ANY, NS, or other queries I tried. Maybe this domain was fixed after it got abused for DDoS attacks.
ANY fnop.net
The response for this domain is also truncated. Likely also fixed.
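To put a number on the nlrb.gov case above, the amplification factor is simply the response size divided by the query size. A minimal sketch, using the 37-byte query payload and the 5,826-byte maximum response quoted earlier (the function name is illustrative, and header overhead below the DNS payload is ignored):

```python
def amplification_factor(query_bytes, response_bytes):
    """Ratio of bytes reflected at the victim to bytes sent by the attacker."""
    if query_bytes <= 0:
        raise ValueError("query size must be positive")
    return response_bytes / query_bytes


factor = amplification_factor(37, 5826)
print(f"{factor:.1f}x")  # prints 157.5x
```

In other words, every megabit of spoofed queries could turn into roughly 157 megabits aimed at the victim, assuming the resolver returns the full response.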
"Fixing" Amplification via ANY records
There are a few other defensive techniques that show up more often. Google's domain name service returns a "Not Implemented" error for ANY queries:
% dig ANY dshield.org
; <<>> DiG 9.10.6 <<>> ANY dshield.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOTIMP, id: 27119
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
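The status and flags that dig prints come from the second 16-bit word of the 12-byte DNS header. The sketch below decodes such a header with the standard library only; the sample header is rebuilt by hand from the values in the dig output above (id 27119, flags qr rd ra, status NOTIMP), not captured from the wire, and `parse_dns_header` is a name chosen for this example.

```python
import struct

RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL", 3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def parse_dns_header(message):
    """Decode the fixed 12-byte DNS header into a small dictionary."""
    ident, flags, qd, an, ns, ar = struct.unpack("!6H", message[:12])
    rcode = flags & 0x000F  # response code lives in the low 4 bits
    return {
        "id": ident,
        "qr": bool(flags & 0x8000),  # message is a response
        "rd": bool(flags & 0x0100),  # recursion desired
        "ra": bool(flags & 0x0080),  # recursion available
        "status": RCODES.get(rcode, str(rcode)),
        "counts": (qd, an, ns, ar),
    }


# Header reconstructed from the dig output: qr + rd + ra + rcode 4 = 0x8184
sample = struct.pack("!6H", 27119, 0x8184, 1, 0, 0, 1)
print(parse_dns_header(sample))
```

A monitoring script could use the same check to count how many resolvers answer ANY queries with NOTIMP rather than a full record set.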
A few years ago, RFC 8482 was published, which specifically allows truncating ANY responses. I see more and more domains returning the "HINFO" record "RFC8482" instead of the full ANY response.
Corrupt ANY Requests
But, while looking into the DNS responses, I also saw some odd malformed queries.
These are odd and somewhat interesting to a packet-focused person:
As text:
09:37:49.571420 IP 45.148.10.248.18177 > 70.91.145.9.53: 17767+ [1au] ANY? o^Cco. (33)
0x0000: 4500 003d 762e 4000 f211 0291 2d94 0af8 E..=v.@.....-...
0x0010: 465b 9109 4701 0035 0029 0000 4567 0100 F[..G..5.)..Eg..
0x0020: 0001 0000 0000 0001 046f 0363 6f00 00ff .........o.co...
0x0030: 0001 0000 29ff ff00 0000 0000 00 ....)........
I highlighted the IP header in yellow and underlined it with a dashed line. The UDP header is underlined using dots and highlighted in green. In red and enclosed in a box, you will see the hostname.
A couple of observations to start about the IP and UDP headers:
- The TTL is large (0xf2, or 242), exceeding the more common starting TTLs of 128 and 64.
- The UDP checksum is 0, which is valid for IPv4 and just tells the receiver not to verify the UDP checksum.
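Both observations can be checked directly against the hex dump. The following sketch pastes the dump into a string and unpacks the IPv4 and UDP headers with `struct`; the helper name `parse_ipv4_udp` is an illustrative choice, and the code assumes a plain IPv4 packet carrying UDP with no link-layer header in front.

```python
import ipaddress
import struct

PACKET_HEX = """
4500 003d 762e 4000 f211 0291 2d94 0af8
465b 9109 4701 0035 0029 0000 4567 0100
0001 0000 0000 0001 046f 0363 6f00 00ff
0001 0000 29ff ff00 0000 0000 00
"""
packet = bytes.fromhex("".join(PACKET_HEX.split()))


def parse_ipv4_udp(pkt):
    """Pull the fields discussed in the text out of the IPv4 and UDP headers."""
    (ver_ihl, _tos, total_len, _ident, _frag, ttl, proto, _cksum,
     src, dst) = struct.unpack("!BBHHHBBH4s4s", pkt[:20])
    ihl = (ver_ihl & 0x0F) * 4  # IPv4 header length in bytes
    sport, dport, udp_len, udp_cksum = struct.unpack("!HHHH", pkt[ihl:ihl + 8])
    return {
        "src": str(ipaddress.IPv4Address(src)),
        "dst": str(ipaddress.IPv4Address(dst)),
        "total_len": total_len,
        "ttl": ttl,
        "proto": proto,
        "sport": sport,
        "dport": dport,
        "udp_len": udp_len,
        "udp_checksum": udp_cksum,
    }


info = parse_ipv4_udp(packet)
print(info["ttl"], info["udp_checksum"])  # prints 242 0
```

The remaining fields match the tcpdump summary line as well: source 45.148.10.248 port 18177, destination 70.91.145.9 port 53.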
But the really interesting part is the hostname. DNS encodes hostnames in a zero-terminated length-value format. Each label is preceded by a one-byte length field. For example, "isc.sans.edu" would be encoded as "03"isc"04"sans"03"edu"00".
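The encoding just described is easy to reproduce. This is a small sketch of it (the function name `encode_name` is illustrative, and it deliberately ignores name compression, which only matters inside full DNS messages):

```python
def encode_name(hostname):
    """Encode a hostname as length-prefixed labels ending in a zero byte."""
    stripped = hostname.rstrip(".")
    if not stripped:
        return b"\x00"  # the root name is just the terminator
    encoded = bytearray()
    for label in stripped.split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError(f"label length out of range: {label!r}")
        encoded.append(len(raw))  # one-byte length prefix
        encoded += raw
    encoded.append(0)  # zero-length label terminates the name
    return bytes(encoded)


print(encode_name("isc.sans.edu").hex(" "))
# prints 03 69 73 63 04 73 61 6e 73 03 65 64 75 00
```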
The sequence in the captured packet above,
04 6F 03 63 6F 00
implies a single label with a length of 4 bytes. But one byte of the label is "03", which is not a printable ASCII character and not a valid byte value for a hostname. It is more likely that the author of this denial of service tool "messed up" and meant to say:
01 6F 02 63 6F 00
Which would be a valid domain, "o.co", and it could work to amplify queries. The ANY record for the domain is short but contains invalid data:
;; QUESTION SECTION:
;o.co. IN ANY
;; ANSWER SECTION:
o.co. 900 IN NSEC \000.o.co. A NS SOA MX TXT RRSIG NSEC DNSKEY
The hostname in the NSEC record starts with a NULL byte! No idea what this is about. Let me know if you can figure it all out :)
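The difference between the malformed query name and the name the tool's author probably intended becomes obvious when both byte sequences are decoded label by label. A minimal sketch, with `decode_labels` and `is_hostname_label` as illustrative helper names and no support for compression pointers:

```python
import string

LDH = set((string.ascii_letters + string.digits + "-").encode("ascii"))


def decode_labels(wire):
    """Split an uncompressed wire-format name into its raw labels."""
    labels = []
    pos = 0
    while True:
        if pos >= len(wire):
            raise ValueError("name is missing its zero terminator")
        length = wire[pos]
        pos += 1
        if length == 0:
            return labels
        if length > 63:
            raise ValueError("compression pointers are not handled in this sketch")
        label = wire[pos:pos + length]
        if len(label) != length:
            raise ValueError("label runs past the end of the data")
        labels.append(label)
        pos += length


def is_hostname_label(label):
    """True if the label uses only letters, digits, and hyphens."""
    return all(byte in LDH for byte in label)


observed = decode_labels(bytes.fromhex("04 6f 03 63 6f 00"))
intended = decode_labels(bytes.fromhex("01 6f 02 63 6f 00"))
print(observed, [is_hostname_label(l) for l in observed])  # [b'o\x03co'] [False]
print(intended, [is_hostname_label(l) for l in intended])  # [b'o', b'co'] [True, True]
```

The observed bytes decode to one four-byte label containing the control character 0x03, which is what tcpdump renders as "o^Cco", while the corrected bytes decode to the two labels of "o.co".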
One more reason to love DNS. There is always a surprise waiting for you.
If you are interested in a video walkthrough, see this YouTube recording:
---
Johannes B. Ullrich, Ph.D. , Dean of Research, SANS.edu