Abstract
It is argued that data should be defined as information on properties of units of analysis. Epistemologically, it is important to establish that what is considered data by somebody need not be data for somebody else. This article considers the nature of data and “big data” and the relation between data, information, knowledge and documents. Common to all these concepts is that they are about phenomena produced in specific contexts for specific purposes and may be represented in documents, including as representations in databases. In that process, they are taken out of their original contexts and put into new ones, and thereby data lose some or all of their meaning due to the principle of semantic holism. Some of this lost meaning should be reestablished in the databases, and the representation of data/documents cannot be understood as a neutral activity, but rather as an activity supporting the overall goal implicit in establishing the database. Utilizing (big) data (as is the case with utilizing information, knowledge and documents) demands, first of all, identifying the potential of these data for relevant purposes. The most fruitful theoretical frame for knowledge organization and data science is the social epistemology suggested by Shera (1951). One important aspect of big data is that they are often unintentional traces we leave during all kinds of activities. Their potential to inform somebody about something is therefore less direct than that of data that have been produced intentionally, as in, for example, scientific databases.
Original language | English |
---|---|
Journal | Knowledge Organization |
Volume | 45 |
Issue number | 8 |
Pages (from-to) | 685-708 |
Number of pages | 24 |
ISSN | 0943-7444 |
DOI | |
Status | Published - Dec. 2018 |