Geolocation through Artificial Intelligence #OSINT

Joey Tribbiani (Friends) points at the television where he himself appears

Unless you live under a rock, if you use social media you have surely seen, or starred in, those posts from a favorite coffee shop, close to home, at work, on the road, or at one of those unforgettable Project X-style epic parties. We love to share those moments, don’t we?

Often, we don’t even realize the mountain of information we give to the world every time we post something on our social networks. We expose ourselves to the Internet without thinking about the possible consequences. It’s like venturing into the digital jungle without a map, not knowing what predators lurk.

But do you know what’s crazy? With every post we drop clues about our location, our daily routine, our activities, our family and our friends, among many other things. It’s as if we were leaving digital breadcrumbs for anyone to follow, revealing our life to an invisible and potentially dangerous audience.

And it’s not just the information we know we share, but also the information we share without being aware of it: let’s not forget about metadata. When we take a photo with our cell phone, for example, the image file can contain a lot of information that is easy to overlook, from the exact location where the photo was taken to the make, model and software version of the device, and even details about the camera settings. These small details seem harmless, but they reveal a great deal and leave us vulnerable to attacks by those who know where to look.

Left and right: examples of metadata extracted from photos captured with a cell phone.
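To make the location part concrete: GPS coordinates in photo metadata (EXIF) are typically stored as degrees, minutes and seconds plus a hemisphere letter, not as a single decimal number. The sketch below converts such a value into the decimal degrees that mapping tools expect. The function name `dms_to_decimal` and the sample coordinates are illustrative assumptions, not values taken from a real photo.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees."""
    decimal = degrees + minutes / 60 + seconds / 3600
    # South latitudes and West longitudes are negative in decimal notation
    if hemisphere.upper() in ("S", "W"):
        decimal = -decimal
    return decimal


# Illustrative values only (not read from an actual image)
latitude = dms_to_decimal(40, 25, 0.0, "N")
longitude = dms_to_decimal(3, 42, 0.0, "W")
print(round(latitude, 4), round(longitude, 4))
```

Reading the raw values out of an image file would need an EXIF parser, which is outside the scope of this sketch.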

While it is true that many social networks have implemented measures to strip much of this metadata and protect our privacy, there are often still ways to extract information, especially with the help of technologies such as Artificial Intelligence. For example, an algorithm trained to recognize visual patterns in images could identify elements that reveal the geographic location where the photo was taken.

And here comes the interesting part! Let’s look at a simple example of how Artificial Intelligence can help us obtain information from posts exposed on the Internet. As mentioned before, many social networks remove metadata, including the location data of photos, which makes it difficult to trace their origin. Manually reviewing thousands of images is not practical, so how about creating a small Python program that passes the images to an Artificial Intelligence, geolocates them and shows on a map where they were taken?

The first thing we will need is an Artificial Intelligence capable of estimating a location from the images we provide. For this, I have selected picarta.ai.

This service first analyzes each image by looking at its metadata. When location metadata is absent, the AI model steps in to predict where in the world the photo was taken.

In addition, picarta.ai offers an API that allows up to 100 free queries per month simply by creating an account.
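Because each image costs one query, it can help to plan a run against that monthly allowance before sending anything. The following is a minimal sketch, assuming a limit of 100 queries as stated above; the helper `plan_batch` and the `queries_used` counter are hypothetical and are not part of the picarta.ai API.

```python
FREE_MONTHLY_QUERIES = 100  # free-tier limit mentioned in the text


def plan_batch(image_names, queries_used=0, limit=FREE_MONTHLY_QUERIES):
    """Split image names into those that fit in the remaining quota and the rest."""
    remaining = max(limit - queries_used, 0)
    return image_names[:remaining], image_names[remaining:]


# Hypothetical folder listing with 120 images and 30 queries already spent
names = [f"img_{i}.jpg" for i in range(120)]
send_now, send_later = plan_batch(names, queries_used=30)
print(len(send_now), "to send now,", len(send_later), "left for later")
```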

The second thing we will need is an interactive map where we can add markers at the locations that picarta.ai returns. For this purpose, I have selected Leaflet.js, an easy-to-use, open-source JavaScript library.

Now that we have everything ready, it’s time to program. First, we will create a “.py” file and import the modules we will need:

import os
import requests
import json
import base64
import webbrowser

os: Provides functions for interacting with the operating system, such as file and directory manipulation.
requests: Allows HTTP requests to be made in Python, such as sending GET or POST requests to web servers.
json: Facilitates serialization and deserialization of data in JSON format, commonly used to exchange data between applications.
base64: Provides functions to encode and decode data in Base64 format, useful for converting binary data into readable text and vice versa.
webbrowser: Provides an interface to open and control web browsers from Python, allowing automation of tasks related to Internet browsing.
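As a quick illustration of why `base64` and `json` appear together: JSON cannot carry raw bytes, so binary image data is first turned into Base64 text. The sketch below uses a few made-up bytes as a stand-in for a real image file and shows that the data survives the round trip unchanged.

```python
import base64
import json

# Stand-in for the bytes of an image file (not a valid image)
raw_bytes = b"\xff\xd8\xff\xe0 example bytes"

# Bytes -> Base64 text, which can be placed inside a JSON document
encoded = base64.b64encode(raw_bytes).decode("utf-8")
document = json.dumps({"IMAGE": encoded})

# JSON document -> Base64 text -> original bytes
decoded = base64.b64decode(json.loads(document)["IMAGE"])
print(decoded == raw_bytes)
```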

Next, we will create a function that I have called ‘rastreator’. It sends requests to the picarta.ai API to obtain the positioning results and returns them together with the image in Base64 format, so that both can later be saved in a JSON file.

def rastreator(image_path, api_token):
    # Endpoint of the picarta.ai classification API
    url = "https://picarta.ai/classify"
    headers = {"Content-Type": "application/json"}

    # Read the image and encode it as Base64 text so it can travel inside JSON
    with open(image_path, "rb") as image_file:
        img_data = base64.b64encode(image_file.read()).decode('utf-8')

    payload = {"TOKEN": api_token, "IMAGE": img_data}

    response = requests.post(url, headers=headers, json=payload)

    if response.status_code == 200:
        result = response.json()
        # Keep the encoded image next to the prediction so the map can show it
        result["image_base64"] = img_data
        return result
    else:
        print("The request failed for image", image_path)
        return None

Next, we will create another function, which I have named ‘add_markers_to_html’. It takes care of adding the markers to the Leaflet.js map, each with the image from which the location was extracted.

def add_markers_to_html(results):
    # Accumulates one Leaflet marker definition per geolocated image
    markers_script = ""
    for result in results:
        # Each entry has the form {filename: api_result}
        filename = list(result.keys())[0]
        ai_lat = result[filename]["ai_lat"]
        ai_lon = result[filename]["ai_lon"]
        # The popup embeds the image itself as a Base64 data URI
        markers_script += f'''
        var marker = L.marker([{ai_lat}, {ai_lon}]).addTo(map);
        marker.bindPopup('<img src="data:image/jpeg;base64,{result[filename]["image_base64"]}" style="max-width:300px;">');
        '''
    return markers_script
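Since ‘add_markers_to_html’ reads "ai_lat" and "ai_lon" directly, an entry without those keys would stop the program with a KeyError. The following is a minimal sketch of a filter that could run beforehand. It assumes only the result layout used in this post ({filename: result}); the helper names `has_valid_coordinates` and `plottable` are hypothetical.

```python
def has_valid_coordinates(result):
    """Return True if the result holds a numeric latitude and longitude in range."""
    lat = result.get("ai_lat")
    lon = result.get("ai_lon")
    if not isinstance(lat, (int, float)) or not isinstance(lon, (int, float)):
        return False
    return -90 <= lat <= 90 and -180 <= lon <= 180


def plottable(results):
    """Keep only the {filename: result} entries that can be placed on the map."""
    kept = []
    for entry in results:
        filename = next(iter(entry))
        if has_valid_coordinates(entry[filename]):
            kept.append(entry)
    return kept


# Made-up sample data: the second entry has no coordinates
sample = [
    {"cafe.jpg": {"ai_lat": 40.4, "ai_lon": -3.7, "image_base64": ""}},
    {"blurry.jpg": {"image_base64": ""}},
]
print([next(iter(entry)) for entry in plottable(sample)])
```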

Finally, in the ‘main’ function we define the logic that processes all the images in the folder specified when the program runs. It obtains their locations through the ‘rastreator’ function, saves the results in a JSON file and generates an interactive HTML file showing a map with one marker per location, built by the ‘add_markers_to_html’ function. When the program finishes, the map opens automatically in the browser so that the results can be explored visually.

def main():
    api_token = ""  # Add your API key here
    folder_path = input("Please enter the path where the images are located: ")
    output_file = "datos.json"
    html_file = "mapa.html"

    results = []

    # Send every supported image in the folder to the API
    for filename in os.listdir(folder_path):
        # Compare in lower case so that files such as "IMG_001.JPG" are not skipped
        if filename.lower().endswith((".jpg", ".jpeg", ".png")):
            image_path = os.path.join(folder_path, filename)
            print("Processing:", image_path)
            result = rastreator(image_path, api_token)
            if result:
                results.append({filename: result})

    # Save the raw results (including the Base64 images) for later use
    with open(output_file, "w") as f:
        json.dump(results, f, indent=4)

    print("The results have been saved to", output_file)

    markers_script = add_markers_to_html(results)

    # Write the page as UTF-8 and declare it, so the © symbol displays correctly
    with open(html_file, "w", encoding="utf-8") as html:
        html.write(f'''<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8">
        <title>Results map</title>
        <script src="https://unpkg.com/leaflet@1.7.1/dist/leaflet.js"></script>
        <link rel="stylesheet" href="https://unpkg.com/leaflet@1.7.1/dist/leaflet.css" />
    </head>
    <body style="margin: 0;">
        <div id="map" style="width: 100%; height: 100vh;"></div>
            <script>
                var map = L.map('map').setView([0, 0], 2);
                L.tileLayer('https://{{s}}.tile.openstreetmap.org/{{z}}/{{x}}/{{y}}.png', {{
                    attribution: '© OpenStreetMap contributors'
                }}).addTo(map);
                L.control.scale().addTo(map);
                {markers_script}
            </script>
    </body>
</html>'''
)

    print("The file has been generated and opened:", html_file)

    webbrowser.open(html_file)

So that ‘main’ is called when the file is executed as a script, add the following at the end of the code:

if __name__ == "__main__":
    main()
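The generated map always starts at a world view (`setView([0, 0], 2)`). One possible refinement, sketched below under the same {filename: result} layout, is to compute the bounding box of all predicted coordinates in Python and emit a Leaflet `map.fitBounds(...)` call that could be written right after `{markers_script}` in the HTML template. The helper name `bounds_script` is hypothetical, and the sketch deliberately ignores the case of points that straddle the 180° meridian.

```python
def bounds_script(results):
    """Return a JavaScript line that zooms the Leaflet map to fit every marker."""
    coords = []
    for entry in results:
        filename = next(iter(entry))
        data = entry[filename]
        if "ai_lat" in data and "ai_lon" in data:
            coords.append((data["ai_lat"], data["ai_lon"]))

    if not coords:
        return ""  # nothing to fit, so the default world view is kept

    lats = [lat for lat, _ in coords]
    lons = [lon for _, lon in coords]
    # Leaflet accepts bounds as [[south, west], [north, east]]
    return f"map.fitBounds([[{min(lats)}, {min(lons)}], [{max(lats)}, {max(lons)}]]);"


# Made-up sample results
sample_results = [
    {"a.jpg": {"ai_lat": 40.0, "ai_lon": -3.5}},
    {"b.jpg": {"ai_lat": 48.5, "ai_lon": 2.25}},
]
print(bounds_script(sample_results))
```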

Complete code:

import os
import requests
import json
import base64
import webbrowser

def rastreator(image_path, api_token):
    # Endpoint of the picarta.ai classification API
    url = "https://picarta.ai/classify"
    headers = {"Content-Type": "application/json"}

    # Read the image and encode it as Base64 text so it can travel inside JSON
    with open(image_path, "rb") as image_file:
        img_data = base64.b64encode(image_file.read()).decode('utf-8')

    payload = {"TOKEN": api_token, "IMAGE": img_data}

    response = requests.post(url, headers=headers, json=payload)

    if response.status_code == 200:
        result = response.json()
        # Keep the encoded image next to the prediction so the map can show it
        result["image_base64"] = img_data
        return result
    else:
        print("The request failed for image", image_path)
        return None

def add_markers_to_html(results):
    # Accumulates one Leaflet marker definition per geolocated image
    markers_script = ""
    for result in results:
        # Each entry has the form {filename: api_result}
        filename = list(result.keys())[0]
        ai_lat = result[filename]["ai_lat"]
        ai_lon = result[filename]["ai_lon"]
        # The popup embeds the image itself as a Base64 data URI
        markers_script += f'''
        var marker = L.marker([{ai_lat}, {ai_lon}]).addTo(map);
        marker.bindPopup('<img src="data:image/jpeg;base64,{result[filename]["image_base64"]}" style="max-width:300px;">');
        '''
    return markers_script

def main():
    api_token = ""  # Add your API key here
    folder_path = input("Please enter the path where the images are located: ")
    output_file = "datos.json"
    html_file = "mapa.html"

    results = []

    # Send every supported image in the folder to the API
    for filename in os.listdir(folder_path):
        # Compare in lower case so that files such as "IMG_001.JPG" are not skipped
        if filename.lower().endswith((".jpg", ".jpeg", ".png")):
            image_path = os.path.join(folder_path, filename)
            print("Processing:", image_path)
            result = rastreator(image_path, api_token)
            if result:
                results.append({filename: result})

    # Save the raw results (including the Base64 images) for later use
    with open(output_file, "w") as f:
        json.dump(results, f, indent=4)

    print("The results have been saved to", output_file)

    markers_script = add_markers_to_html(results)

    # Write the page as UTF-8 and declare it, so the © symbol displays correctly
    with open(html_file, "w", encoding="utf-8") as html:
        html.write(f'''<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8">
        <title>Results map</title>
        <script src="https://unpkg.com/leaflet@1.7.1/dist/leaflet.js"></script>
        <link rel="stylesheet" href="https://unpkg.com/leaflet@1.7.1/dist/leaflet.css" />
    </head>
    <body style="margin: 0;">
        <div id="map" style="width: 100%; height: 100vh;"></div>
            <script>
                var map = L.map('map').setView([0, 0], 2);
                L.tileLayer('https://{{s}}.tile.openstreetmap.org/{{z}}/{{x}}/{{y}}.png', {{
                    attribution: '© OpenStreetMap contributors'
                }}).addTo(map);
                L.control.scale().addTo(map);
                {markers_script}
            </script>
    </body>
</html>'''
)

    print("The file has been generated and opened:", html_file)

    webbrowser.open(html_file)

if __name__ == "__main__":
    main()

To run the program, we only need to execute the “.py” file that I have named “CapCap.py” using Python, and then provide the path where the images are stored.

End of program execution
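Each run sends every image in the folder to the API again, which spends queries on images that were already geolocated. A minimal sketch of one way around this, assuming the "datos.json" layout produced by the program (a list of {filename: result} entries), is to load the previous results and process only the files that are still missing. The helper names `load_previous_results` and `pending_images` are hypothetical.

```python
import json
import os


def load_previous_results(output_file="datos.json"):
    """Return the results saved by an earlier run, or an empty list if none exist."""
    if not os.path.exists(output_file):
        return []
    with open(output_file, "r", encoding="utf-8") as f:
        return json.load(f)


def pending_images(filenames, previous_results):
    """Return the filenames that do not yet appear in the saved results."""
    done = {name for entry in previous_results for name in entry}
    return [name for name in filenames if name not in done]


# Made-up example: "a.jpg" was processed in an earlier run
previous = [{"a.jpg": {"ai_lat": 40.0, "ai_lon": -3.5}}]
print(pending_images(["a.jpg", "b.jpg", "c.png"], previous))
```

New results would then be appended to the previous list before it is written back to the JSON file.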

It is important to note that this program is not specifically focused on collecting information or images of users on social networks such as Instagram. It is oriented instead toward exploring possible further uses of such information.

Therefore, we will not see how to create a scraper to retrieve large amounts of images and related information from social networks; perhaps in future posts? This is also why the result is not so impressive: we lack information such as the publication date of the images, which prevents us from building a complete timeline. Such a timeline would give us a broader view of a person’s movements, activities and relationships. It is also worth clarifying that Artificial Intelligence can make mistakes and that we should not take the result as 100% reliable.
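To illustrate what such a timeline could look like if publication dates were available, here is a minimal sketch. Everything about the input is an assumption: the program in this post does not collect dates, and the field name `published_at` (an ISO 8601 string), the flat entry layout and the sample values are all hypothetical.

```python
from datetime import datetime


def build_timeline(entries):
    """Sort geolocated entries by their (hypothetical) publication date."""
    dated = [entry for entry in entries if entry.get("published_at")]
    return sorted(dated, key=lambda entry: datetime.fromisoformat(entry["published_at"]))


# Made-up entries; "published_at" is not something the program above produces
entries = [
    {"filename": "party.jpg", "ai_lat": 40.4, "ai_lon": -3.7, "published_at": "2024-03-02T23:10:00"},
    {"filename": "cafe.jpg", "ai_lat": 40.5, "ai_lon": -3.6, "published_at": "2024-03-01T09:30:00"},
    {"filename": "undated.jpg", "ai_lat": 41.0, "ai_lon": -3.0},
]

for entry in build_timeline(entries):
    print(entry["published_at"], entry["filename"], entry["ai_lat"], entry["ai_lon"])
```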

That said, this is the result based on 64 images extracted from one user’s posts on social networks.

With all these considerations, we can see how the combination of technologies such as Artificial Intelligence and interactive mapping tools can provide fascinating insights into the information we inadvertently share on the Internet.
