Google and open data

Google’s commitment to open data

Within the broad spectrum of technologies that Google works on, open data is among its priorities. Why does Google pay special attention to it? What tools and initiatives has it implemented in this area?

Having relevant data freely available can determine whether a project succeeds, and it can drive the continuous improvement of any institution's or company's operations.

Google has all this in mind, and has therefore developed an entire ecosystem around the concept of open data, offering a large number of resources with vast potential for open data professionals.

In this article we will talk about the reasons that drive Google to work in favor of open data and how this work is taking shape in different projects and tools.

Open data that supports scientific progress

We have previously discussed how open data is changing our society. Many scientific and technological innovations lie behind this process of change.

The scientific community shows a growing interest in open data, as evidenced by the results of The State of Open Data Report 2019 survey. Among its respondents:

  • 79% support a legal initiative requiring that primary research be open.
  • 67% believe those who do not share their data should be penalized when the funder has required them to share it.
  • 69% believe openness of data should be a mandatory requirement to qualify for funding.
  • 36% fear that if data is freely available it will be misused.

What role does Google play in the world of open data?

Google is firmly committed to open data. In fact, many of its projects in this area are intended to make life easier for scientific researchers and for people working in both private and public organizations.

This tech giant shows a clear and active willingness to open up data, as the projects discussed later in this article demonstrate.

Why does Google commit to open data?

According to Google, three clear motivations drive it to open up data:

  • Making useful, accessible and free information available to everyone, which is one of Google's main goals.
  • Promoting scientific progress beyond Google. In this way, the research community obtains powerful fuel for its work in the form of relevant data.
  • Easing the onboarding of new staff. Because the tools are openly available, new Google employees are often already familiar with them, which shortens the adaptation process and learning curve and makes them productive from the very beginning.

Google’s tools for open data

Google’s clear commitment to open data is expressed in a number of solutions and technological resources that benefit both open data professionals and the general public.

Perhaps the most widely used tool is Google Dataset Search, a search engine for locating datasets published across the web.

What options does this tool offer? For one, it lets you filter results by criteria such as content type (text, graphics, tables, etc.) or free access.

In addition, to help you refine your search, each result carries metadata that describes the dataset, such as its last update date, origin, brief description and author.
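Dataset Search reads this kind of metadata from structured markup that publishers embed in their dataset pages, based on the schema.org Dataset vocabulary. The following is a minimal sketch in Python of what such a record might look like. The helper function, dataset name, URL, date and organization are hypothetical placeholders chosen for illustration, not taken from any real catalog.

```python
import json


def build_dataset_metadata(name, description, url, date_modified, creator_name):
    """Return a schema.org Dataset record as a plain dict (illustrative helper)."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,                    # title shown for the dataset
        "description": description,      # brief description
        "url": url,                      # origin of the dataset
        "dateModified": date_modified,   # last update date (ISO 8601)
        "creator": {"@type": "Organization", "name": creator_name},  # author
    }


# Hypothetical example values.
record = build_dataset_metadata(
    name="Example air quality measurements",
    description="Hourly air quality readings from a fictional city.",
    url="https://example.org/datasets/air-quality",
    date_modified="2020-01-15",
    creator_name="Example Environmental Agency",
)

# Serialize to JSON-LD, the form usually embedded in a web page.
json_ld = json.dumps(record, indent=2)
print(json_ld)
```

A publisher would typically place the resulting JSON inside a script tag of type application/ld+json on the dataset's landing page.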

But Dataset Search is not the only way in which Google supports the openness of data.

Google Cloud Public Datasets is another sign of this commitment to open data. This solution gives users access to hundreds of public datasets of different types through the BigQuery cloud data warehouse.

These cloud-hosted datasets are prepared for use in Machine Learning processes, and the service has specific functionality for geospatial information, such as satellite images. It also integrates with tools such as Data Studio, making it possible to build highly visual and attractive reports.
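To give an idea of how a public dataset can be reached through BigQuery, here is a minimal sketch in Python. It assumes the google-cloud-bigquery client library is installed and that Google Cloud credentials and a project are already configured. The table used (US baby names) and the helper name top_names are illustrative choices, not something prescribed by Google.

```python
# Illustrative public table in the bigquery-public-data project (US baby names).
TABLE = "bigquery-public-data.usa_names.usa_1910_2013"

# The state is passed as a named query parameter (@state) rather than
# being pasted into the SQL text.
SQL = f"""
SELECT name, SUM(number) AS total
FROM `{TABLE}`
WHERE state = @state
GROUP BY name
ORDER BY total DESC
LIMIT 10
"""


def top_names(state):
    """Return the ten most common names for a US state as (name, total) pairs.

    Needs the google-cloud-bigquery package and valid Google Cloud credentials.
    """
    # Imported here so the sketch can be loaded without the library installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(
        query_parameters=[bigquery.ScalarQueryParameter("state", "STRING", state)]
    )
    rows = client.query(SQL, job_config=job_config).result()
    return [(row["name"], row["total"]) for row in rows]
```

Calling top_names("TX") would run the query against the public table and return the results as a list of tuples.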

On the subject of visualizing data to gain better insight, we must mention Google Public Data Explorer. This tool plots data provided by public and academic institutions, such as the World Bank or the National Statistical Institute.

Continuing with visual content, Google also offers open image and video datasets. The Open Images Dataset has over 9 million images that have been annotated with image-level labels and object bounding boxes. In fact, 8.4 objects are labelled per image on average.

Videos are found in the YouTube-8M Dataset, which has more than 6 million videos across about 4,000 classes, with an average of 3 labels per video. As with the images, this labeling is done automatically by means of Artificial Intelligence. Both sources are very useful for developing biometric recognition systems.

Regarding Google’s involvement in open source, we highlight initiatives such as the well-known Android, Chromium, TensorFlow and Kubernetes. In addition, Google has contributed greatly to GitHub.

Among the projects aimed at making useful information accessible and available to the public, we highlight the initiatives built on data collected from public entities.

Another project worth mentioning is Kaggle, a platform specialized in building and training a community of Machine Learning and data science specialists, which Google acquired in 2017. Kaggle has about 20,000 open datasets.

Google has a wide presence in the world of open data, as the initiatives above demonstrate. It is therefore worth keeping an eye on how they evolve, as well as on any future projects. We will keep you updated on the latest developments.
