The concept of open data is becoming more and more important. You can read about it in all kinds of media, which talk about Linked Open Data, Open Data Research and so on, and it is championed by politicians, governments, scientists and others. From the beginning it has been a great revolution, but are all the data that are shared really open? In this article we present the basic principles of open data.
The 8 Open Government Data Principles
In December 2007, 30 open government advocates met in Sebastopol, California, and drew up a set of 8 basic principles of open data. That meeting, held on the 7th and 8th of December, laid the foundations for understanding why open data is essential for democracy and for open government.
Because the Internet is a public space, governments have the opportunity to understand the needs of citizens, and citizens in turn can participate actively in government. Information thus becomes an invaluable tool for improving the affairs of the nation.
The California group offered a total of eight fundamental principles to differentiate open government data and help make governments around the world more effective and more transparent. These are the basic principles:
1.- Complete
This principle holds that an informed and participatory society needs access to all the information (data) generated by the government. Only in this way will we be able to make decisions as a society. Let us also bear in mind that citizens have, in a sense, already paid for these data with their taxes and can rightfully claim that they be freely available.
Data or information resources that are not in an electronic format (e.g. values from analog measuring devices) cannot be subject to these principles. However, it is recommended that, as far as possible, these data be converted to a digital format so that they can be shared equally.
2.- Primary
Data are collected at the source, with the highest possible level of granularity, not in aggregated or modified form.
Open data must be detailed and raw. This means unprocessed data that have not passed through any filter. In addition, information must be provided on how the data were obtained and where the original documents they refer to are located, so that the user can check whether the data have been collected and stored correctly.
Administrations frequently publish data only after filtering or aggregating them according to some criterion. This principle calls for refraining from that, so that the reuser or citizen can process the original data according to their own convenience or interest.
We understand that this principle bases its reasoning on objectivity and transparency and requires that there be no modification to the data in order to be able to study them in depth.
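To illustrate why raw data matters, here is a minimal Python sketch. The records, field names and values are invented for the example. It shows that a reuser holding the primary records can compute whichever aggregate they need, whereas a published average could not be turned back into the original rows.

```python
from collections import defaultdict

# Hypothetical raw records, one row per measurement (all values are made up).
raw_records = [
    {"station": "A", "date": "2020-01-01", "no2": 41},
    {"station": "A", "date": "2020-01-02", "no2": 39},
    {"station": "B", "date": "2020-01-01", "no2": 18},
    {"station": "B", "date": "2020-01-02", "no2": 22},
]


def average_by(records, key, value):
    """Group the raw records by `key` and average the `value` field."""
    groups = defaultdict(list)
    for row in records:
        groups[row[key]].append(row[value])
    return {group: sum(vals) / len(vals) for group, vals in groups.items()}


# The same raw data supports different aggregations chosen by the reuser.
by_station = average_by(raw_records, "station", "no2")
by_date = average_by(raw_records, "date", "no2")
```

If the administration had published only the per-station averages, the per-date view would no longer be computable.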
3.- Timely
Data shall be made available as quickly as necessary to preserve the value of the data.
They must be up to date at all times and available to users. Priority should be given to the dissemination of data that are "time sensitive".
4.- Accessible
Data is available to the widest range of users for the widest range of purposes.
This means taking into account how data preparation and publishing options affect people with disabilities and how they can affect users of a variety of software and hardware platforms.
Data should be published with current industry standards and protocols, as well as in alternative formats when required for reuse.
Data is not accessible if it can only be obtained through a web form or if automated tools cannot access it due to a robots.txt file, or any other policy or technological restriction.
However, it is noted that in general the quality of published open data is not high. In fact, there are datasets that contain errors in their distributions, which are incomplete or not very detailed. This makes it impossible for both citizens and reusers to make practical use of them.
A study was recently published that analyzes the maturity level of the open data initiatives of the EU Member States. One of the indicators in this study measures the usability of the platforms, that is, whether those platforms actually work when it comes to facilitating the reuse of data from different sectors.
The scores are lower than expected, because only 44% of the countries have an optimal level of maturity. Spain, for its part, is in the leading group within Europe.
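The robots.txt restriction mentioned under this principle can be checked with a few lines of Python. This is a minimal sketch using the standard library's `urllib.robotparser`. The robots.txt content, the `data.example.org` URLs and the `open-data-bot` user agent are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt of an imaginary portal that blocks its dataset folder.
robots_txt = """\
User-agent: *
Disallow: /datasets/
"""


def allows_automated_access(robots_body, url, user_agent="open-data-bot"):
    """Return True if the given robots.txt text lets `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    return parser.can_fetch(user_agent, url)


dataset_ok = allows_automated_access(
    robots_txt, "https://data.example.org/datasets/budget.csv"
)
about_ok = allows_automated_access(robots_txt, "https://data.example.org/about")
```

A portal whose dataset URLs fail this check is, by the criterion above, not accessible to automated tools, even if a person can download the same files by hand.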
5.- Machine processable
Data is reasonably structured to allow automated processing.
In order for open data to be used, they must be properly coded. Free texts are not a substitute for tabulated and structured data, the image of a text is not the same as the text itself. Practically speaking, this indicates that formats such as PDF, any graphics such as JPG or PNG or unstructured texts in a TXT file cannot be processed automatically and must be rejected as open data (or alternatives to them must be included). The format of the published data and the meaning of the published structured data elements must be sufficiently documented.
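The difference between structured data and free text can be seen in a short Python sketch. The district names and figures are invented for the example. The CSV version is read directly with the standard `csv` module, while the free-text sentence holding the same facts would need custom, fragile parsing.

```python
import csv
import io

# Hypothetical dataset published as structured CSV (values are made up).
csv_text = """\
district,year,population
North,2020,15200
South,2020,9800
"""

# The same facts as free text: readable by people, awkward for programs.
free_text = "In 2020 the North district had 15,200 inhabitants and the South 9,800."

# Structured data can be loaded and processed without guessing at its layout.
rows = list(csv.DictReader(io.StringIO(csv_text)))
total_population = sum(int(row["population"]) for row in rows)
```

The column names in the header row act as a minimal form of the documentation this principle asks for. A real dataset would also describe units and meaning in a separate data dictionary.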
6.- Non-discriminatory
The data is available to anyone, without registration.
Open data must be available to anyone without prior registration or identification. Anonymous access is permitted, including access through anonymous proxies.
Some public bodies require identification in order to access data that is public. These records and identifications are barriers to distributing content. For example, a city council that, in order to offer its geospatial data in the open data portal, required its users to register would be violating this principle.
7.- Non-proprietary
Data is available in a format over which no entity has exclusive control.
The formats in which the data are presented should preferably be open or at least include open formats among the published ones. Proprietary formats add unnecessary restrictions to those who use them. This means that if data is to be offered in proprietary formats such as Excel or DWG for example, it should at the same time be offered in open formats such as CSV, ODF, XML, SVG, and so on.
Much of the data we think complies with the Open Data precepts is presented in proprietary or unstructured formats and is therefore not easily reusable. It is common to find that open formats are not used, and are even little known, and many companies do not even know that they can derive economic benefit from them.
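As a rough illustration of this principle, the following Python sketch checks whether a dataset's files include at least one open format. The extension lists are an assumption that simply follows the classification used in this article (Excel and DWG as proprietary; CSV, ODF, XML and SVG as open), and the file names are hypothetical.

```python
from pathlib import PurePosixPath

# Assumed classification, based only on the formats named in this article.
OPEN_EXTENSIONS = {".csv", ".xml", ".svg", ".ods", ".odt"}
PROPRIETARY_EXTENSIONS = {".xls", ".xlsx", ".dwg"}


def has_open_alternative(filenames):
    """Return True if at least one of the files is in an open format."""
    extensions = {PurePosixPath(name).suffix.lower() for name in filenames}
    return bool(extensions & OPEN_EXTENSIONS)


def proprietary_only(filenames):
    """Return True if files are offered in proprietary formats and no open one."""
    extensions = {PurePosixPath(name).suffix.lower() for name in filenames}
    uses_proprietary = bool(extensions & PROPRIETARY_EXTENSIONS)
    return uses_proprietary and not has_open_alternative(filenames)


cartography = ["city_map.dwg", "city_map.svg"]  # proprietary plus open copy
budget = ["budget_2020.xlsx"]                   # proprietary only
```

Here `cartography` satisfies the principle because the DWG file comes with an SVG alternative, while `budget` does not.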
8.- License-free
The data are not subject to any copyright, patent, trademark or trade secret regulation. Reasonable privacy and security restrictions are permitted.
Through open data portals, such as those that use the OGoov Platform, public administrations can make data available to citizens for redistribution, reuse and commercialization. It is essential that these data carry an open license such as CC0, CC-BY, PDDL or ODC-BY.
Consider the following case: a government publishes its cartographic data, but on downloading them users find a clause that prevents commercial use. The license of such data is not "open", and the government would be violating this principle.
Data may not be subject to copyright or patent restrictions. However, because government information is a mixture of public records and personal information, a clear distinction must be made, when publishing it, between what is protected by privacy policies and what is in the public domain.
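A portal could run a simple check like the one sketched below in Python. The set of accepted identifiers contains only the four licenses named above, so it is illustrative rather than a complete list of open licenses, and the function name is hypothetical.

```python
# Only the licenses named in this article; not an exhaustive list.
OPEN_LICENSES = {"CC0", "CC-BY", "PDDL", "ODC-BY"}


def is_open_license(license_id):
    """Return True if the identifier matches one of the listed open licenses."""
    return license_id.strip().upper() in OPEN_LICENSES


# A non-commercial clause, as in the cartography case, is not an open license.
cartography_is_open = is_open_license("CC-BY-NC")
statistics_is_open = is_open_license("cc-by")
```

The exact-match comparison is deliberate: "CC-BY-NC" contains "CC-BY" as a prefix but adds a restriction that breaks this principle.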
After seeing what the basic principles of Open Data are, the question that underlies everything is, do we really make use of this open data? And if not, what are the reasons why this theoretically available information is not exploited?
If you are interested in knowing the answer, we recommend this recent article published by Politica Comunicada, which gives 14 well-founded reasons that accurately analyze the current situation. You will not be surprised to see that among the most frequent reasons is ignorance of, or failure to comply with, one or several of the basic principles we have presented today. Until next week.