Institutions of Higher Education in Germany
Source:vignettes/higher_ed_institutions.Rmd
As of March 13, 2024, the Federal Statistical Office of Germany (Destatis) lists a total of 427 institutions of higher education in the country. The table below lists most of these institutions, together with links providing a wide variety of additional information.
Data source
The data is sourced from the following tables and lists:
the key tables for the student statistics (table series 372) provided by the Federal Statistical Office (Destatis) on its data collection portal (Erhebungsportal). Four key tables / csv files contain data on institutions of higher education in Germany: 1.) Hochschule, 2.) HochschulStandort, 3.) HochschuleErsteinschreibung, and 4.) HochschulFachbereich. They differ slightly in how and where institutions are counted (e.g. when a university has subsidiary campuses in other cities or German states, or in the classification of university hospitals),
the list of all institutions of higher education in Germany provided by the German Rectors’ Conference (HRK). Together, the Destatis data and the HRK list are the “most authoritative German numbers” that exist,
the German and English Wikipedia article for each institution,
the Wikidata entry for each institution,
Logos are sourced as data URIs from Wikimedia Commons (preferred), or from the German or English Wikipedia entry. If available, SVGs are preferred to raster formats.
as described in vignette("ror"), we use the global Research Organization Registry (ROR) REST API to retrieve additional data and identifiers, which are then matched to the existing data.
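The logo item above mentions data URIs. As a minimal sketch of what that could look like for an SVG, the following base R helper percent-encodes the markup and prepends a data URI header. The function name svg_to_data_uri and the toy SVG are assumptions made for illustration; this is not necessarily how the logos in the dataset were produced.

```r
# Hypothetical helper (not part of the package): turn SVG markup into a
# percent-encoded data URI using base R only.
svg_to_data_uri <- function(svg) {
  # reserved = TRUE also encodes characters such as < > " and #;
  # repeated = TRUE forces encoding even if the markup already contains "%xx"
  encoded <- utils::URLencode(svg, reserved = TRUE, repeated = TRUE)
  paste0("data:image/svg+xml;charset=utf-8,", encoded)
}

# Toy SVG standing in for a logo file downloaded from Wikimedia Commons
svg <- '<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10"><rect width="10" height="10"/></svg>'
logo_uri <- svg_to_data_uri(svg)
```

The resulting string can be used wherever an image URL is expected, for example in the src attribute of an img tag. Raster formats such as PNG would need base64 encoding instead, which base R does not provide out of the box.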
Identifiers
If available, the following identifiers for each institution are included:
the 9-digit Participant Identification Code (PIC) used in the EU Funding & Tenders Portal (Wikidata Property P5785),
the identifier for the GEPRIS database of funded research projects, published by the German Research Foundation (DFG) (Wikidata Property P4870),
the German Research Institution (GERiT) ID of each institution (see https://gerit.org/en/about),
the ID used by the German Rectors’ Conference (HRK),
the ETER (European Tertiary Education Register) ID by the Research Infrastructure for Research and Innovation Studies (RSIS; see https://register.orgreg.joanneum.at/#/about/1); note that a free account is required to use their services,
the Research Organization Registry (ROR) ID (see vignette("ror") and Wikidata Property P6782),
the ID of the World Higher Education Database (WHED) by the International Association of Universities.
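When matching identifiers from several sources, a quick format check can catch copy-and-paste errors early. The sketch below is an illustration only: the helper names are hypothetical, the PIC rule follows the "9-digit" description above, and the ROR pattern (a leading 0, six base32 characters, two check digits) is my assumption about the ID format and does not verify the checksum. The example values are made up.

```r
# Hypothetical helpers for sanity-checking identifier formats.
is_valid_pic <- function(x) {
  # PIC: exactly nine digits, as described in the list above
  grepl("^[0-9]{9}$", x)
}

is_valid_ror <- function(x) {
  # Assumed ROR shape: "0", six base32 characters (no i, l, o, u),
  # then two digits; an optional "https://ror.org/" prefix is stripped first.
  id <- sub("^https://ror\\.org/", "", x)
  grepl("^0[a-hj-km-np-tv-z0-9]{6}[0-9]{2}$", id)
}

# Made-up values, used only to exercise the checks
pic_ok <- is_valid_pic(c("999999999", "12345", NA))
ror_ok <- is_valid_ror(c("https://ror.org/012345678", "012345678", "12345678x"))
```

Both helpers are vectorised, so they can be applied to a whole identifier column at once; missing values simply come back as FALSE.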
City
If available, the following websites based on the institution’s city are linked:
the German Academic Exchange Service (DAAD) page,
the CHE University Ranking page as hosted by the German Academic Exchange Service.
Institutions of higher education
If available, the following websites providing further details on each institution of higher education and its programs are linked:
the CHE University Ranking as hosted by the German Academic Exchange Service,
the studieren.de page,
the CHE University Ranking as hosted by ZEIT Online / Hey Studium.
Rankings
If available, the following rankings are linked:
the CHE University Ranking as hosted by the German Academic Exchange Service,
the Times Higher Education Ranking (Wikidata Property P5586).
Social Media
If available, these social media sites are linked:
Facebook (Wikidata Property P2013),
Instagram (Wikidata Property P2003),
LinkedIn (Wikidata Property P4264),
Mastodon (Wikidata Property P4033),
TikTok (Wikidata Property P7085),
YouTube (Wikidata Property P2397).
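Wikidata stores these properties as bare IDs or usernames, not as full links. A minimal sketch of how such a value could be turned into a URL follows. The helper social_url is hypothetical, and the URL templates are my assumptions about each platform's public profile scheme, not something taken from the dataset; the example values are made up.

```r
# Hypothetical helper: build a profile URL from a Wikidata property value.
# The URL templates below are assumptions, not taken from the dataset.
social_url <- function(property, value) {
  switch(property,
    P2013 = paste0("https://www.facebook.com/", value),
    P2003 = paste0("https://www.instagram.com/", value, "/"),
    P4264 = paste0("https://www.linkedin.com/company/", value),
    P4033 = {
      # Mastodon addresses have the form user@server
      parts <- strsplit(value, "@", fixed = TRUE)[[1]]
      paste0("https://", parts[2], "/@", parts[1])
    },
    P7085 = paste0("https://www.tiktok.com/@", value),
    P2397 = paste0("https://www.youtube.com/channel/", value),
    NA_character_  # unknown property
  )
}

# Made-up example values
mastodon_link <- social_url("P4033", "uni@example.social")
youtube_link  <- social_url("P2397", "UC123")
```

Unknown properties return NA, so a missing mapping shows up as a gap in the table and not as a malformed link.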
APIs
The following APIs were helpful in creating the data above:
YouTube Data API v3 for retrieving the proper channel IDs,
Wiki Core REST API to search for Wikipedia entries and retrieve translated pages,
Wikidata SPARQL query service to obtain data from Wikidata pages (can be queried through the WikidataR package),
SOAP XML web service from Destatis to obtain data from the Federal Statistical Office of Germany (can be queried through the wiesbaden package).
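To give an idea of the kind of query the Wikidata SPARQL service accepts, the sketch below builds a query that looks up items by ROR ID (P6782) and optionally returns some of the social media properties listed earlier. The function build_sparql and the ROR ID in the example are made up for illustration, and this is not necessarily the query used to build the dataset.

```r
# Hypothetical helper: assemble a SPARQL query for a set of ROR IDs.
# `properties` is a named character vector: names become SPARQL variables.
build_sparql <- function(ror_ids, properties) {
  values   <- paste(sprintf('"%s"', ror_ids), collapse = " ")
  vars     <- paste0("?", names(properties))
  optional <- sprintf("  OPTIONAL { ?item wdt:%s %s . }", properties, vars)
  paste(c(
    sprintf("SELECT ?item ?ror %s WHERE {", paste(vars, collapse = " ")),
    sprintf("  VALUES ?ror { %s }", values),
    "  ?item wdt:P6782 ?ror .",
    optional,
    "}"
  ), collapse = "\n")
}

query <- build_sparql(
  ror_ids    = c("012345678"),  # made-up ROR ID
  properties = c(facebook = "P2013", youtube = "P2397")
)
cat(query, "\n")
```

The resulting string can be pasted into the query editor at https://query.wikidata.org/ or passed to whichever R client is used to talk to the service. Wrapping each property in OPTIONAL keeps institutions in the result even when they lack one of the accounts.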
A rant about the Higher Education Compass / Hochschulkompass and data licences
One obvious addition to the ?hochschulen dataset would be a link to the Higher Education Compass / Hochschulkompass by the HRK. Unfortunately, the compass website uses a lot of obfuscating JavaScript to make direct links very difficult. Now, the HRK is a semi-public institution funded by the Stiftung zur Förderung der Hochschulrektorenkonferenz. This foundation mainly raises its funds through grants from the Länder and from the Federal Ministry of Education and Research. Moreover, the mostly public members of the HRK (around 271 institutions of higher education) pay annual membership fees.
According to the website, “all information found in the Higher Education Compass is authorized by the universities and is updated by employees at the universities themselves”. Against this backdrop, it is incomprehensible that the HRK deems it worthwhile to protect the Higher Education Compass data. Use of this data by the public and/or private competitors to provide better offerings with additional data or different UI should be encouraged, not made more difficult! After all, this is non-confidential data on public institutions largely paid for by taxpayers’ money, administered by staff mostly in the public sector…
The ?hochschulen dataset consciously consists almost exclusively of identifiers and links, so as to avoid data licensing issues. The private providers in particular, like ZEIT or studieren.de, naturally have a vested business interest in protecting their data and in discouraging direct links as best they can.
About this project
Like most good projects, this one started out of a sense of annoyance. In particular, I looked at the Destatis list of 427 institutions of higher education and realized that they did NOT provide a detailed breakdown of the member institutions in each category. For the time being, only the Länder statistical agencies provide this level of detail. Moreover, I noticed that the list of all institutions of higher education in Germany provided by the German Rectors’ Conference (HRK) did not use the institutional identifiers used by the Länder statistical agencies (as one would expect). Nor did their numbers match up with Destatis. I will admit, writing these two slights down, I may come off as easily annoyed - but I can assure you, these sorts of things can be heart-wrenching, really!
Annoyed as I was, I started to dig, and dig, and dig a little deeper still. I discovered along the way that trying to count institutions of higher education was by no means a particularly innovative endeavor - after all, ETER does it at the European level, WHED does it at the global level, and GERiT and ROR do it for research institutions (a slightly different emphasis). Had I known about these initiatives from the start - rather than stumbling upon them during the process - I would have avoided a lot of work. Alas, the proof of the pudding is in the eating…
Aims of this project
A lot of the data assembled in the ?hochschulen dataset will find its more permanent home on Wikidata - another great resource I have only come to appreciate as this project has progressed. For the time being, the idea is that this dataset can save other people time matching up different data sources and IDs. People who do work on rankings, or people who work in communication departments come to mind, but there are certainly lots of other applications.
To Dos
- Add bundesland to gt table
- Consistent use of English in the ?hochschulen dataset, rename German columns
- Document every column of the ?hochschulen dataset
- Crop whitespace for svg logo images, e.g. for the Chemnitz University of Technology (11305)
- The CHE-Ranking is currently listed twice: in the hochschule column and in the ranking column
- The HRK ID in the ID column should have no URL attached to it
- Find a better home for this article (own R package? Personal website?)
- Find a way to get rid of the floating TOC for this vignette
- Cite R packages used
- Update Wikidata with missing social media links and other identifiers
- Provide csv download of raw data