Folksonomies: what and why
For centuries, humans have used classification systems to provide context and direction to human knowledge. Traditionally, applying organization schemes to knowledge was the work of librarians and information scientists (Quintarelli, 2005), but in recent years this has changed. Kipp and Campbell (2006) explain this with the assumption that the “World Wide Web is a complex, adaptive system”, and in complex systems there are no centralized indexing and cataloguing systems. Professionals created high-quality metadata, but this work was time-consuming, which made it difficult to scale and to keep up with the huge amount of new content being produced. At a time when anybody could become a publisher and an unprecedented amount of web content could be produced thanks to lower technology and cost barriers, there was an emerging need for an indexing and categorizing system that users who are not trained information professionals could master (Quintarelli, 2005). The solution was folksonomies.
Thomas Vander Wal coined the term “folksonomy” in 2004, in a mailing list discussion about Flickr’s and Delicious’ “user-defined labels or tags to organize and share information”.
“So the user-created bottom-up categorical structure development with an emergent thesaurus would become a Folksonomy?”,
he wrote, and the term was born.
A “folksonomy”, a blend of “folk” and “taxonomy”, is a user-generated classification that consists of tags, created in a social environment that is usually shared and open to others. The practice is sometimes known as collaborative tagging (Golder & Huberman, 2005; Ames & Naaman, 2007; Dellschaft & Staab, 2008), social tagging (Derntl et al, 2011; Strohmaier et al, 2012), social classification, social categorization, ethnoclassification and mob indexing (Morville & Rosenfeld, 2006). However, Vander Wal disagrees with these alternative terms in a blogpost from 2005, where he claims that a folksonomy “is not collaborative, it is not putting things in to categories, it is not related to taxonomy (more like the antithesis of a taxonomy), etc”.
A folksonomy is created when users tag content or information, such as web pages, photos, videos, podcasts, tweets and scientific papers. Strohmaier et al (2012) elaborate: the term “tagging” refers to a “voluntary activity of users who are annotating resources with terms – so-called “tags” – freely chosen from an unbounded and uncontrolled vocabulary”. Ames & Naaman (2007) explain that a tag is an unstructured textual label, and it serves as a simple form of metadata (Brooks & Montanez, 2006). Maier & Thalmann (2008, as referred to in Derntl et al, 2011) define a tag as a triplet of a user, a keyword and a resource. Tags appear in different formats and under different labels: tags, hashtags, rel-tags, annotations and tripletags are different tag types, Gupta et al (2011) explain.
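The triplet definition of a tag can be illustrated with a small data structure. The following Python sketch is only an illustration and is not taken from any of the cited works; the names TagAssignment, folksonomy and tags_by_resource, as well as the sample users, keywords and URLs, are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class TagAssignment(NamedTuple):
    """One tagging act: a user attaches a keyword to a resource."""
    user: str
    keyword: str
    resource: str


# Hypothetical sample data: the folksonomy is the collection of all triplets.
folksonomy = [
    TagAssignment("alice", "folksonomy", "http://example.org/paper"),
    TagAssignment("bob", "tagging", "http://example.org/paper"),
    TagAssignment("alice", "cats", "http://example.org/photo"),
]


def tags_by_resource(assignments):
    """Group the keywords that users have attached to each resource."""
    index = defaultdict(set)
    for assignment in assignments:
        index[assignment.resource].add(assignment.keyword)
    return dict(index)


resource_tags = tags_by_resource(folksonomy)
print(sorted(resource_tags["http://example.org/paper"]))
```

Because every triplet records who applied which keyword to which resource, the same list can be grouped by user or by keyword instead, without any hierarchy between the keywords.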
In traditional classification systems, terms are arranged in a strict hierarchy with term relationships such as broader and narrower terms, and synonyms and homonyms are handled by a controlled vocabulary (Kipp & Campbell, 2006). In contrast, a folksonomy is created in a flat namespace, which implies that there is no explicit hierarchy and no parent-child or sibling relationship between tags.
Folksonomies are a trade-off between traditional centralized classification and no classification at all, Gupta et al explain. One of their advantages is that the vocabulary in a folksonomy directly reflects the users’ vocabulary and comprises tags for both popular content and long-tail content. The latter makes it possible for users to browse and discover new content even in narrow topics. Folksonomies also reflect the users’ conceptual model without cultural, social, or political bias. This gives power to the people, Quintarelli claims.
However, while controlled vocabulary is exclusionary by nature (Kroski, 2005), tags are often ambiguous, overly personalized and inexact (Guy & Tonkin, 2006), in the sense that users apply tags to documents in many different ways. Tagging systems often lack mechanisms for handling synonyms, acronyms, homonyms, spelling variations (such as misspellings, singular versus plural forms, and conjugated and compound words), specialized tags, and tags that have no meaning to anyone but the taggers themselves. Some tagging systems do not support tags consisting of multiple words, resulting in tags like “vertigovideostillsbbc”. Due to the lack of synonym control, tags like “kitten” and “pussycat” appear as two different tags with no relationship between them.
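To make the problem concrete, the following Python sketch shows what a minimal normalization step could look like. It is an illustration under stated assumptions, not a description of any existing tagging system: the SYNONYMS table and the normalize_tag function are hypothetical, and the cleaning rule (keep only lowercase ASCII letters and digits) is a deliberate simplification.

```python
import re

# Hypothetical synonym table; a real system would need a curated vocabulary.
SYNONYMS = {"pussycat": "kitten"}


def normalize_tag(tag):
    """Lowercase a tag, drop non-alphanumeric characters and map known synonyms."""
    cleaned = re.sub(r"[^a-z0-9]", "", tag.lower())
    return SYNONYMS.get(cleaned, cleaned)


raw_tags = ["Kitten", "kitten ", "pussycat", "kittens"]
normalized = [normalize_tag(tag) for tag in raw_tags]
print(normalized)
```

Case and whitespace variants collapse into one tag, and the synonym is mapped only because it was listed by hand. The plural “kittens” still survives as a separate tag, which shows why such variations are hard to handle without a controlled vocabulary.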
Taggers and their tagging motivation
According to Strohmaier et al (2012), taggers can be divided into two main types: descriptors and categorizers. Descriptors apply tags that appropriately and relevantly describe the resources being tagged. This implies an open set of tags and an unlimited tag vocabulary, Strohmaier et al continue. For a descriptor, the goal of tagging is to identify the tags that best describe the resource, and the chosen tags are closely related to the content of the resource being tagged. This makes the tags created by a descriptor suitable for searching and retrieval, Strohmaier et al claim. Categorizers, on the other hand, apply tags so that the objects are easier to find later; they construct and maintain a personal navigational aid to the tagged resources. Tags are assigned to resources whenever they share some common characteristic, Strohmaier et al claim. The applied tags are very close to the mental model of the user and are therefore a suitable facilitator for navigation and browsing (Strohmaier et al, 2012). Categorizers produce fewer descriptive tags, with lower tag agreement among users for a given resource.
Vander Wal states that people apply tags to online information and objects for their own retrieval. Joshua Porter agrees and claims in a blogpost from 2006 that “personal value precedes network value”, and that users’ main motivation for tagging links on the website Delicious was later retrieval for themselves. Porter claims that all other uses are secondary. However, researchers agree that users have different purposes or motivations when annotating items in social tagging systems (Cantador et al, 2011; Strohmaier et al, 2012; Ames & Naaman, 2007; Hammond et al, 2005). People tag content on the web for many reasons, and tagging motivation ranges from selfish tagging for one’s own retrieval purposes to more altruistic tagging of others’ content for yet others to retrieve. This research shows multiple reasons for tagging (the list is not exhaustive): future retrieval of items for oneself or others, adding context to content for oneself or others, attracting attention from others to the tagged resources, self-presentation or self-reference, expressing opinions, organizing tasks, and more.
Characteristics of tags
Cantador et al claim that users have different purposes when they annotate items in social tagging systems. Tags do not only depict the content of the annotated item, they claim, but sometimes also describe subjective qualities and opinions, or organizational aspects such as self-reference and personal tasks. Strohmaier et al claim that the function of each tag is determined by pragmatics rather than semantics.
Cantador et al propose four categories for organizing tags, while Golder & Huberman suggest that tags be categorized by their purpose. The first of Cantador et al’s categories is content-based tags, which are tags that describe the content of the item, such as objects and living things. Content-based tags overlap to some extent with Golder & Huberman’s tag purposes “what (or who) it is about”, “what it is” and “who owns it”.
The next category is context-based tags, which provide contextual information about an item, such as location, time and period. Context-based tags overlap with the purpose “refining categories”. Both the content-based and the context-based tag categories “are nouns denoting physical and non-physical entities, whose definition can be found in dictionary, encyclopedias or thesauri”, Cantador et al claim. They further suggest the category subjective tags, which are tags that express opinions and qualities about an item. This category overlaps with Golder & Huberman’s proposed tag purpose “identifying qualities or characteristics”. Finally, there are organizational tags, which define personal usage or tasks, or indicate self-references. This category overlaps with Golder & Huberman’s purposes “self reference” and “task organizing”.
Application areas for folksonomies
It is difficult to find application areas where folksonomies are used nowadays. I believe this is because the term “folksonomy” is outdated and is not used to describe modern features. However, the concept of tagging objects in a social setting still exists on the web, in different application areas. The following list is not exhaustive, but might give you a taste of tagging practices nowadays.
Photo sharing: Both Flickr and Instagram are photo sharing services. Flickr was launched in 2004, before the term folksonomy was coined, while Instagram, launched in 2010, represents newer photo sharing services. On Instagram one can also share video clips with a maximum duration of 15 seconds. In both services, users upload their photos and tag them with whatever keywords they find suitable. In Flickr, users can only tag their own content, while in Instagram users can tag any photo, including photos that belong to other users. Access to content is granted on different privacy levels. Tags in Instagram are called “hashtags”.
Blogs/microblogs: There are many blog systems available, and one of the most popular is WordPress. In November 2015, WordPress powered over 25% of all websites in the world. Not all of them are blogs; however, the functionality in the system is the same. WordPress has functionality for tagging articles or blogposts, and in the same way as with the above-mentioned photo sharing systems, users can use any tag they deem appropriate. In WordPress, tags are used as a navigation tool, so that readers can find similar or related content. Related content is, in this context, content that is tagged with identical tags. In WordPress, tags are sometimes used to generate tag clouds, which are aggregations of the tags used in the blogposts.
Figure 1: Tag cloud of the most popular tags from Flickr. Retrieved from http://www.webopedia.com/TERM/T/tag_cloud.html
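The aggregation behind a tag cloud can be sketched as counting how often each tag is used and scaling a font size accordingly. The Python example below is a minimal illustration and does not reproduce how WordPress or Flickr build their tag clouds; the sample posts, the function name tag_cloud and the linear scaling between min_size and max_size are assumptions made for the example.

```python
from collections import Counter

# Hypothetical blogposts, each with the tags its author applied.
posts = {
    "post-1": ["folksonomy", "tagging", "web"],
    "post-2": ["tagging", "flickr"],
    "post-3": ["tagging", "web"],
}


def tag_cloud(posts, min_size=10, max_size=30):
    """Map each tag to a font size scaled linearly by how often it is used."""
    counts = Counter(tag for tags in posts.values() for tag in tags)
    low, high = min(counts.values()), max(counts.values())
    span = max(high - low, 1)  # avoid division by zero when all counts are equal
    return {
        tag: min_size + (count - low) * (max_size - min_size) // span
        for tag, count in counts.items()
    }


cloud = tag_cloud(posts)
print(cloud)
```

In this sample the most frequently used tag, “tagging”, receives the largest size, while tags used only once receive the smallest.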
Twitter was known in the mid-2000s as a microblogging system, but is now regarded as a social network service. However, the functionality is the same: it enables registered users to send and read short 140-character messages called “tweets”. Users apply hashtags in their tweets, and hashtags provide the means to follow a conversation or discussion, or to find related tweets that share the same hashtags.
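Since hashtags are embedded in the message text itself, they must be extracted before tweets can be grouped by tag. The following Python sketch shows one simple way to do this. The pattern is a simplification (a “#” followed by word characters) and is not Twitter’s actual hashtag rule; the function name extract_hashtags and the sample tweet are hypothetical.

```python
import re

# Simplified pattern: '#' followed by letters, digits or underscores.
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def extract_hashtags(tweet):
    """Return the lowercased hashtags found in a tweet, without the leading '#'."""
    return [tag.lower() for tag in HASHTAG_PATTERN.findall(tweet)]


tweet = "Reading about #Folksonomy and #tagging on the web"
hashtags = extract_hashtags(tweet)
print(hashtags)
```

Lowercasing is a design choice in this sketch, so that “#Folksonomy” and “#folksonomy” lead to the same conversation.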
Social citation systems: Bibsonomy.org is described as a social citation system for literature exchange, where a user can save and organize publication references and bookmarks. Users can tag their references and bookmarks, and discover related literature from others based on tags.
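Discovery of related items based on tags can be sketched as ranking resources by the number of tags they share. The Python example below is an illustration only and does not describe how Bibsonomy computes related literature; the sample references and the function name related are hypothetical.

```python
# Hypothetical bookmarks: each reference is mapped to the set of tags users gave it.
references = {
    "ref-a": {"folksonomy", "tagging", "web"},
    "ref-b": {"tagging", "recommendation"},
    "ref-c": {"music", "metadata"},
}


def related(target, references):
    """Rank other references by the number of tags they share with the target."""
    target_tags = references[target]
    overlaps = {
        name: len(target_tags & tags)
        for name, tags in references.items()
        if name != target
    }
    # Keep only references with at least one shared tag, most overlap first.
    return sorted(
        (name for name, shared in overlaps.items() if shared > 0),
        key=lambda name: -overlaps[name],
    )


related_to_a = related("ref-a", references)
print(related_to_a)
```

Because the comparison relies on identical tag strings, the weaknesses of uncontrolled vocabularies carry over: two references tagged with different synonyms would not be found as related.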
Bookmarking: Delicious (formerly known as del.icio.us) was, along with Flickr, one of the first services in the world that offered its users the ability to tag items. Delicious was launched in 2003. In Delicious, users can tag their bookmarks with freely chosen terms and see content tagged by others. On Delicious, one also finds aggregated lists of the most recent links with a given tag, for example “folksonomy”: https://delicious.com/tag/folksonomy/alltime.
Media such as music/video: MusicBrainz is a community-maintained open source encyclopedia of music information, where users can tag content, such as artists, records, tracks and the relationships between them. When users tag content, they contribute “to the project by adding information about your favorite artists and their related works”. In MusicBrainz, tags are comma-separated and may consist of several words, unlike in many other services that offer tagging. Tags are automatically converted to lowercase.
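The tag input format described here (comma-separated, multi-word tags, converted to lowercase) can be sketched as a small parsing step. The Python example below is not MusicBrainz code; the function name parse_tag_input is hypothetical, and the collapsing of extra whitespace and the removal of duplicates are assumptions added for the example.

```python
def parse_tag_input(text):
    """Split a comma-separated tag string into lowercase, whitespace-trimmed tags."""
    tags = []
    for part in text.split(","):
        tag = " ".join(part.split()).lower()  # collapse inner whitespace, lowercase
        if tag and tag not in tags:  # skip empty entries and duplicates
            tags.append(tag)
    return tags


parsed = parse_tag_input("Progressive Rock, british , Progressive rock,")
print(parsed)
```

Using the comma rather than the space as separator is what allows a tag such as “progressive rock” to remain two words instead of being run together.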
YouTube offers video upload and sharing, and a video can be annotated with tags. Ann Smarty from Internet Marketing Ninjas explains in a blogpost that “tags help users find your video when they search the site. When users type keywords related to your tags your video will appear in their search results”. The user can apply any keyword, including compound words, and as the user types a keyword, the system suggests terms on the fly.
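Suggesting terms while the user types can be sketched as a prefix lookup over previously used tags. How YouTube actually generates its suggestions is not described in the sources used here, so the following Python example is an assumption-laden illustration; the tag counts, the function name suggest and the ranking by usage frequency are all hypothetical.

```python
from collections import Counter

# Hypothetical counts of how often each tag has been used before.
tag_counts = Counter({"cat": 40, "cats": 25, "catalog": 5, "dog": 30})


def suggest(prefix, tag_counts, limit=3):
    """Return up to `limit` known tags starting with the prefix, most used first."""
    prefix = prefix.lower()
    matches = [tag for tag in tag_counts if tag.startswith(prefix)]
    return sorted(matches, key=lambda tag: (-tag_counts[tag], tag))[:limit]


suggestions = suggest("cat", tag_counts)
print(suggestions)
```

Suggestions of this kind nudge users towards tags that already exist, which may reduce spelling variations without imposing a controlled vocabulary.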
Folksonomies suffer from a lack of recall, a lack of precision due to the variability of language, and a lack of synonym control. They have low findability quotients: they are suitable for serendipity and browsing, but not good for searching.
Research on folksonomies includes taxonomies derived from tagging systems, simulation of the tagging process and tag recommendation (Strohmaier et al, 2012), as well as folksonomy-based recommendation systems (Cantador et al, 2011).
Ames, M. & M. Naaman (2007) Why We Tag: Motivations for Annotation in Mobile and Online Media. SIGCHI conference on Human factors in computing systems, New York, NY, USA. ACM Press.
Brooks, C. H. & N. Montanez (2006) Improved annotation of the blogosphere via autotagging and hierarchical clustering. WWW ’06: Proceedings of the 15th international conference on World Wide Web, New York, NY, USA. ACM Press.
Cantador, I., I. Konstas & J. M. Jose (2011) Categorising social tags to improve folksonomy-based recommendations. In: Web Semant., 9(1), p. 1-15.
Dellschaft, K. & S. Staab (2008) An epistemic dynamic model for tagging systems. HT ’08: Proceedings of the nineteenth ACM conference on Hypertext and hypermedia, New York, NY, USA. ACM.
Derntl, M. et al. (2011) Inclusive social tagging and its support in Web 2.0 services. In: Computers in Human Behavior, 27(4), p. 1460-1466.
Golder, S. & B. A. Huberman (2005) The Structure of Collaborative Tagging Systems.
Gupta, M. et al. (2011) An Overview of Social Tagging and Applications. In: Aggarwal, C. C. (ed.) Social Network Data Analytics: Springer, p. 447-497.
Guy, M. & E. Tonkin (2006) Folksonomies: Tidying up Tags? In: D-Lib Magazine, 12(1), p. 1-15.
Hammond, T. et al. (2005) Social Bookmarking Tools (I). In: D-Lib Magazine.
Kipp, M. & D. G. Campbell (2006) Patterns and Inconsistencies in Collaborative Tagging Systems: An Examination of Tagging Practices. In: Proceedings Annual General Meeting of the American Society for Information Science and Technology.
Mathes, A. (2004) Folksonomies – Cooperative Classification and Communication Through Shared Metadata.
Morville, P. & L. Rosenfeld (2006) Information Architecture for the World Wide Web. 3rd ed. Cambridge, MA: O’Reilly Media.
Porter, J. (2006) The Delicious Lesson [online]. [Blogpost]. URL: http://bokardo.com/archives/the-delicious-lesson/ (3.nov).
Quintarelli, E. (2005) Folksonomies: power to the people. In.
Strohmaier, M., C. Körner & R. Kern (2012) Understanding why users tag: A survey of tagging motivation literature and results from an empirical study. In: Web Semantics: Science, Services and Agents on the World Wide Web, 17, p. 1-11.
Vander Wal, T. (2007) Folksonomy [online]. URL: http://vanderwal.net/folksonomy.html (14.nov).
 In this context, long-tail content refers to content to which only one or very few tags are applied.
 http://w3techs.com/blog/entry/wordpress-powers-25-percent-of-all-websites, retrieved 30.11.2015
 From the musicbrainz.org About page, http://musicbrainz.org/doc/About