The Visualization of the Vector Space Model in Searching for Immigration News in the East Nusa Tenggara Region
Abstract
The Immigration Office is a public service agency whose activities are frequently documented and published as news articles. The sheer volume of published articles makes it difficult to locate a specific one. One way to improve search efficiency is ranking, a core task in information retrieval. Information retrieval is the process of finding material, typically documents, of an unstructured nature, usually text, that satisfies an information need from within a large collection. One technique for document retrieval is the Vector Space Model (VSM). The VSM applies concepts from linear algebra, in particular the vector space, to build a document model that can be searched for the desired documents. Each input document is transformed into a column vector. The proximity between two vectors, such as a query and a document, is then measured by the angle they form, and the documents are sorted from the smallest to the largest angle. This establishes the ranking order, from the most relevant to the least relevant document. Among term-weighting schemes, tf-idf stands out because it considers both how often a word occurs in each online document and how many online documents contain that word. This study presents a visualization of the VSM applied to the search for documents related to immigration in the East Nusa Tenggara region.
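The ranking procedure summarized above can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the whitespace tokenizer, the idf variant log(N / df), the function names, and the three sample headlines are all assumptions made for illustration only.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def vectorize(tokens, vocab, idf):
    """Map a token list to a tf-idf vector ordered by the vocabulary."""
    counts = Counter(tokens)
    return [counts[term] * idf[term] for term in vocab]


def build_tfidf(docs):
    """Return (vocabulary, idf weights, tf-idf vectors) for a list of strings."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Assumed idf variant: log(N / df). Other variants add smoothing.
    idf = {term: math.log(len(docs) / df[term]) for term in vocab}
    vectors = [vectorize(tokens, vocab, idf) for tokens in tokenized]
    return vocab, idf, vectors


def angle(u, v):
    """Angle in radians between two vectors; zero vectors count as orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return math.pi / 2
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)


def rank(query, docs):
    """Return document indices sorted from smallest to largest angle."""
    vocab, idf, vectors = build_tfidf(docs)
    q_vec = vectorize(tokenize(query), vocab, idf)
    angles = [(angle(q_vec, vec), i) for i, vec in enumerate(vectors)]
    return [i for _, i in sorted(angles)]


# Hypothetical headlines, invented for this example.
docs = [
    "passport service opens in kupang",
    "immigration office deports foreign national",
    "kupang immigration office extends passport service hours",
]
ranking = rank("passport service kupang", docs)
print(ranking)  # → [0, 2, 1]
```

In this toy collection the second headline shares no term with the query, so its angle is 90 degrees and it is ranked last. The first headline outranks the third because it is shorter, so the shared terms account for a larger share of its vector length.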