Abstract (EN):
As more of daily life moves online, people leave a digital footprint: valuable information scattered across the Web in unstructured form and controlled by the websites that host it. While ready access to this information carries potential harms, such as enabling corporate surveillance, it also offers significant benefits when used, for example, in journalism or in investigations into human trafficking. This paper presents an approach for retrieving domain-specific information from the Web, using social media platforms as a gateway to content hosted on any website. The approach begins by identifying relevant profiles, then collects the links shared in posts to webpages related to those profiles, and lastly extracts and indexes the information gathered. The tool developed based on this approach was tested in a case study in the domain of human trafficking, more specifically sexual exploitation, showing promising results and potential for application in a real-world scenario.
Language:
English
Type (Professor's evaluation):
Scientific
No. of pages:
16