Hellenic OCR Team co-founders Fotis Fitsilis and George Mikros have published their latest open-access article in the Journal of Open Humanities Data! It is the flagship publication on the science and methods behind the Team’s unique composition and approach.
Development and Validation of a Corpus of Written Parliamentary Questions in the Hellenic Parliament
This paper presents the development of the first parliamentary corpus of written questions in the Hellenic Parliament. Moreover, we discuss a well-defined end-to-end process that has been streamlined and optimised to produce high-quality open text data based on parliamentary documents. Based on the above methodology, a representative sample of 2,000 questions from four parliamentary periods in the Hellenic Parliament has been extracted, validated, and placed into an open data repository. Furthermore, open data production is analysed, and several degrees of freedom in its application in alternative data sets are proposed and discussed. Consequently, the authors argue that this method constitutes a transferable and scalable practice that can be used by other representative institutions for the creation and subsequent study of their open data.