A workshop on “Advanced analysis and recognition of parliamentary corpora” (ARPC) will be organized at the International Conference on Document Analysis and Recognition (ICDAR 2024), which will take place in Athens, Greece. The ARPC workshop will be held on Saturday, 31 August 2024.

Scope and motivation

Parliamentary archives house a wealth of contemporary and historical legislative and administrative documents. The study of these corpora presents remarkable opportunities, enabling the convergence of previously disparate fields like history, political science, and linguistics. This in turn opens up novel horizons in comprehending parliamentary data and discourse. The envisioned workshop seeks to offer a state-of-the-art platform for the scholarly discussion of innovative methods for the recognition and analysis of these corpora. In doing so, it will facilitate the exploration of advanced techniques that enhance our understanding of parliamentary materials, foster interdisciplinary connections, and propel research in this dynamic field.

Workshop description

Data-driven insights from archives have the potential to steer academic research in a variety of fields. This workshop addresses the growing importance of employing advanced recognition and analytical methods and tools to decode the complexities within legislative and administrative documents of parliamentary origin. Hence, it is well placed under ICDAR, the premier international event for scientists and practitioners involved in document analysis and recognition. The workshop will take a deep dive into cutting-edge OCR techniques for parliamentary corpora. Further attention will be given to recognizing patterns, extracting meaningful insights, and understanding the intricate dimensions of contemporary and historical parliamentary discourse. The relevance of this topic lies in its potential to bridge previously isolated domains of research, fostering interdisciplinary collaboration. By connecting history, political science, and linguistics, participants will unlock a richer understanding of legislative evolution, political trends, and linguistic nuances embedded in parliamentary proceedings.

Due to the synergy of perspectives from diverse stakeholders, scientific discussions during the workshop are anticipated to yield outcomes that extend beyond individual disciplines. Envisioned outcomes include novel methodologies, the identification of trends, and the establishment of a collaborative network that transcends traditional academic silos. The workshop will be supported and promoted by the co-founders of the Hellenic OCR Team, a global network for analyzing parliamentary data. Established in 2017, the Team represents the first scientific crowdsourcing initiative that aims exclusively at the processing and study of parliamentary textual data.

Acceptable submission topics may include but are not limited to:

  • The recognition of polytonic Greek fonts
  • Recognition of mixed text (printed and handwritten)
  • Parliamentary discourse analysis
  • Historical trends in parliamentary language use
  • Integration of linguistic and political science methodologies in OCR
  • Cross-lingual OCR challenges in parliamentary texts
  • Machine learning approaches for semantic analysis of parliamentary proceedings
  • Ethical considerations in the digitization and analysis of parliamentary records
  • Developing standardized formats for parliamentary data preservation
  • The role of OCR technology in enhancing public access to parliamentary archives
  • Comparative analysis of parliamentary rhetoric across different eras
  • The impact of digital humanities tools on legislative studies
  • Application of natural language processing (NLP) techniques in political discourse analysis
  • Automated categorization and indexing of parliamentary documents
  • Challenges and solutions in digitizing non-standard parliamentary texts

Accepted papers will be included in the conference proceedings. The Call for Papers is already live.

Timing and duration

The ARPC workshop will take place during the ICDAR 2024 conference (August 30 – September 4, 2024) in Athens, Greece. More specifically, it will be held on Saturday, 31 August 2024. A single half-day session is planned. The session will include 5-6 papers, a keynote talk, and a structured discussion.

Target audience

This workshop is designed for parliamentary researchers, data scientists, and policymakers aiming to extract nuanced insights from vast yet unexplored legislative archives.

Organizing committee  

Fotios Fitsilis has over 20 years of professional experience in scientific positions within both the private and the public sectors. Since 2009, he has been Head of the Department for Scientific Documentation and Supervision and Lead Researcher at the Scientific Service of the Hellenic Parliament. Working on a global scale, he has been active in fields ranging from telecommunications and logistics to management and good governance; his recent papers address e-governance and institutional development, including improvements to parliamentary oversight committees. He has been Visiting Professor for parliamentary procedures and legislative drafting at the Universidad Complutense de Madrid. In 2017, he co-founded with George Mikros the Hellenic OCR Team, a crowdsourcing initiative for the study of parliamentary data. Dr. Fitsilis has authored more than 50 scientific publications, including five books. He has an academic background in law (LL.M. in International Law), economics (Diploma in Financial Engineering), and engineering (Diploma in Electrical Engineering), and also holds a doctoral degree in electrical engineering. Email: fitsilisf@parliament.gr

George Mikros is currently a Professor and Coordinator of the MA Program of Digital Humanities and Societies at the Department of Middle Eastern Studies at the Hamad Bin Khalifa University in Qatar. In 2017, together with Dr. Fotios Fitsilis, he co-founded the Hellenic OCR Team with the primary aim of collecting and analyzing parliamentary data. Before this, from 1999 to 2019, he served as a Professor of Computational and Quantitative Linguistics at the University of Athens, Greece, where he founded and became the Director of the Computational Stylistics lab. Since 2013, he has also been Adjunct Professor at the Department of Applied Linguistics at the University of Massachusetts, Boston, USA. He has held the position of Research Associate at the Institute for Language and Speech Processing, where he was part of research groups that developed important language resources and NLP tools for Modern Greek. Since 1999, he has held the position of Teaching Associate at the Hellenic Open University, and from 2016 to 2019, he was the Director of the Undergraduate Program “Spanish Language and Culture.” Prof. Mikros has authored 5 monographs and over 100 papers published in peer-reviewed journals, conference proceedings, and edited volumes. Since 2007, he has been an elected Member of the Council of the International Association of Quantitative Linguistics (IQLA). In the 2018 – 2021 period, he served as its president. He has been a keynote speaker at many international conferences, workshops, and summer schools related to Digital Humanities, AI, Forensic Linguistics, and Quantitative Linguistics. His main research interests are computational stylistics and quantitative, computational, and forensic linguistics. Email: GMikros@hbku.edu.qa

