Temporal and behavioral patterns in the use of Wikipedia
Wikipedia is the most prominent wiki-based platform and continues to provide society at large with a vast collection of content and media resources covering all branches of knowledge. Wikipedia is undoubtedly one of the most remarkable developments in the evolution of encyclopedias and a genuine revolution in the area of knowledge management. Perhaps its most innovative aspect is its underlying approach, which promotes voluntary and altruistic collaboration among users in building content. Neither the growth of Wikipedia nor its popularity has stopped since its beginning. In fact, the number of visits to its different editions has placed its web site among the six most visited on the Internet. This success has spread the use of Wikipedia beyond typical academic environments and has turned it into a mass phenomenon. Because of this relevance, Wikipedia has become a topic of increasing interest for the research community. However, most existing research is concerned with the quality and reliability of the content offered. That research focuses on subjects such as reputation and trust, or addresses topics related to the evolution of Wikipedia and its growth trends. By contrast, this thesis aims to provide an empirical study and an in-depth analysis of the manner in which the different editions of Wikipedia are used by their corresponding communities of users. Our main objective is to find temporal and behavioral patterns describing the different kinds of content and interactions requested by Wikipedia users. Users' requests are expressed in the form of URLs submitted to Wikipedia as part of the traffic directed to its supporting servers.
The analysis presented here consists essentially of the characterization of this traffic and has been carried out by parsing and filtering the information elements extracted from the URLs it contains. Because the enormous volume of requests to Wikipedia forced us to work with a sample, we first validated our results by comparing them with trusted sources. Having analyzed the traffic to Wikipedia over a whole year, this study presents a complete characterization of the different types of requests that compose it. Furthermore, we have found several patterns related to the temporal distribution of these requests as well as to the actions and content involved in them. We also examine how the most frequently searched topics, and other content that the community rates positively, such as featured articles, influence the attention that articles receive. Finally, we have analyzed the categories of articles that attract the most visits and search operations in the editions of Wikipedia considered. Most of the objectives accomplished here rely on the results provided by an application developed ad hoc to feed this study. The software engineering of this tool has been undertaken under the WikiSquilter project. We expect that this application can serve as a useful tool to characterize the traffic directed to wiki-based sites, particularly to any project supported by the Wikimedia Foundation. Prior to this work, no other analysis had studied the use of Wikipedia in such a wide and thorough way. We hope that our efforts and results can serve as a significant contribution to the examination of the dynamics of use when interacting with knowledge management platforms like Wikipedia.
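To illustrate the kind of URL parsing and filtering described above, the following is a minimal Python sketch, not the WikiSquilter implementation. It assumes the common URL layouts `/wiki/<title>` and `/w/index.php?title=...&action=...`; the function name `classify_request`, the short `KNOWN_NAMESPACES` set, and the returned field names are hypothetical choices made only for this example.

```python
from urllib.parse import urlsplit, parse_qs, unquote

# Assumption: a small, illustrative subset of namespace prefixes.
KNOWN_NAMESPACES = {"Talk", "User", "Special", "Category", "File", "Help", "Template"}


def classify_request(url):
    """Extract edition, namespace, title and action from a Wikipedia URL.

    Returns None for URLs that are filtered out (other hosts or layouts).
    """
    parts = urlsplit(url)
    labels = (parts.hostname or "").split(".")
    # Filter: keep only hosts such as en.wikipedia.org or es.m.wikipedia.org.
    if len(labels) < 3 or labels[-2] != "wikipedia":
        return None
    edition = labels[0]

    query = parse_qs(parts.query)
    if parts.path.startswith("/wiki/"):
        # Plain article visit: the title is the rest of the path.
        title = unquote(parts.path[len("/wiki/"):])
        action = "view"
    elif parts.path == "/w/index.php":
        # Script entry point: title and action come from the query string.
        title = query.get("title", [""])[0]
        action = "search" if "search" in query else query.get("action", ["view"])[0]
    else:
        return None

    # Split off a namespace prefix only if it is one we recognize,
    # so that main-namespace titles containing a colon stay intact.
    prefix, sep, rest = title.partition(":")
    if sep and prefix in KNOWN_NAMESPACES:
        namespace, title = prefix, rest
    else:
        namespace = "Main"

    return {"edition": edition, "namespace": namespace, "title": title, "action": action}


if __name__ == "__main__":
    print(classify_request("http://en.wikipedia.org/wiki/Talk:Madrid"))
    print(classify_request("http://es.wikipedia.org/w/index.php?title=Madrid&action=edit"))
```

Counting the returned `action` and `namespace` values per edition and per time interval over a sample of requests would give the sort of characterization the abstract refers to, under the assumptions stated above.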
Doctoral thesis defended at the Universidad Rey Juan Carlos, Madrid, in 2011. Thesis advisor: Jesús M. González Barahona