[IAPR-TC10] Newsletter 156 – November 2023

Welcome to the November edition of the TC10 newsletter.

In this issue, you will find a welcome message for the new educational officer and dataset curator, several calls and reports regarding past and upcoming ICDAR conferences, the SSDA summer school, and the next ICPR and DAS editions. You will also find two calls for nominations from the IAPR, the editorial of the special issue “Advanced Topics in Document Analysis and Recognition” (14 articles), a summary of the latest IJDAR issue, and two job offers.

I wish you pleasant reading,

Christophe Rigaud
IAPR-TC10 Communications Officer

Call for contributions: feel free to contribute to TC10 newsletters by sending any relevant news, event, notice, open position, dataset or link to us at iapr.tc10[at]gmail.com

1) Upcoming deadlines and events


  • Deadlines:
    • November 14, paper submission firm deadline, Journal-First track of ICDAR 2024
    • November 30, proposal submission for hosting DAS 2024
    • December 15, nominations due for the IAPR/ICDAR Young Investigator Award
  • Events:

2024 and later

  • Deadlines
    • January 31, proposal submission for hosting ICDAR 2027
    • March 20, paper submission deadline, ICPR 2024, Kolkata, India (submission opens January 20)
    • March 31, deadline of the IAPR Fellow Awards Call for Nominations
    • April 5, proposal submission for hosting SSDA 2025
  • Events:
    • February 27-29, conference VISAPP, Rome, Italy
    • August 30 - September 4, conference ICDAR 2024, Athens, Greece
    • December 1-5, conference ICPR 2024, Kolkata, India

2) Welcome to the new TC10 educational officer and dataset curator

The TC10 Committee is delighted to welcome two new members to the team: Momina Moetesum as Educational Officer and Sanket Biswas as Dataset Curator.

Welcome to the TC10 committee.

Jean-Christophe Burie

Sanket Biswas is currently a Ph.D. candidate at the Computer Vision Center (CVC), Universitat Autònoma de Barcelona, Spain.

Momina Moetesum is an Assistant Professor in the Faculty of Computing at the School of Electrical Engineering and Computer Sciences (SEECS), National University of Sciences and Technology (NUST), Pakistan.

3) Call for Nominations for the 2024 IAPR Fellow Awards

Deadline: March 31, 2024

Full 2024 Nomination Instructions can be found here (.docx)

To initiate a nomination, a nominator must complete and submit an IAPR Fellow Nomination Form. Any member of an IAPR Member Society can serve as nominator, except for the nominee themself and the current members of the Executive and Fellow Committees.

Each nomination must be endorsed by at least one recommendation letter (submitted endorsement form), either from a member of an IAPR Member Society (different from the nominator) or from an IAPR Fellow.

All documents (Nomination and Endorsement forms) must be submitted electronically and will be acknowledged by email. Submission problems should be reported to the IAPR Webmaster, cc’ing the Fellow Committee Chair, Prof. Umapada Pal, Indian Statistical Institute, Kolkata, India:

To: webmaster@iapr.org
Subject: Submission problem — IAPR Fellowship 2024
CC: umapada@isical.ac.in, umapada_pal@yahoo.com

Click here for a list of members of the IAPR Fellow Committee

IAPR appreciates your efforts to support our fellowship program!

4) Call for Nominations for the IAPR/ICDAR Young Investigator Award

Nominations Due: December 15, 2023

The IAPR/ICDAR Award Program is an established program designed to recognize individuals who have made outstanding contributions to the field of Document Analysis and Recognition in one or more of the following areas:

  • Research
  • Training of students
  • Research/Industry interaction
  • Service to the community

There are two award categories, which were presented together biennially in the past: the Young Investigator Award (less than 40 years old at the time the award is made) and the Outstanding Achievements Award. Because of the new yearly schedule of ICDAR, the two categories will alternate in odd and even years.

For ICDAR 2024, nominations are invited for the following award category:

  • IAPR/ICDAR Young Investigator Award

The award will consist of a token gift and a suitably inscribed certificate. The recipient will be invited to give the opening keynote speech at the ICDAR 2024 conference, introduced by the previous recipient of the award.

The nomination pack should include the following:

  1. A nominating letter (1 page) including a brief citation to be included in the certificate.
  2. Supporting letters (1 page each) from 3 active researchers from at least 3 different countries.

A nomination is usually put forward by a researcher (preferably from a different institution than the nominee) who is knowledgeable about the scientific achievements of the nominee, and who organizes letters of support.

The submission procedure is strictly confidential, and self-nominations are not allowed.

Please send nomination packs electronically to the TC10/11 chairs: Andreas Fischer andreas.fischer@hefr.ch and Jean-Christophe Burie jean-christophe.burie@univ-lr.fr. The deadline for receiving nominations is December 15, 2023, but early submissions are strongly encouraged.

5) ICDAR 2023: ScalDoc workshop report

The ICDAR 2023 workshop on Scaling-up Document Image Understanding aimed at opening the discussion on possible ways to widen our community through the use of data preparation efforts and the definition of large-scale (grand) challenges that drive progress in the field.
Fruitful discussions led to the creation of a Slack channel to gather academic and industrial researchers willing to collaborate on the creation of a “dataset of datasets”.

Please find full report here: https://iapr-tc10.univ-lr.fr/wp-content/uploads/2023/11/ScalDoc-Textual-Report.pdf

This effort should enable the training and evaluation of new foundation models for document understanding.

A first meeting has been scheduled for Wed. 2023-09-13, at 19:00 UTC+2.
Don’t hesitate to join us; we are already more than 20!
Slack invitation: https://bit.ly/datasetofdatasets

6) ICDAR 2024: Journal-first Track Call for Papers and Reviewers

Following a feature of ICDAR 2019 through ICDAR 2023, ICDAR 2024 will again include the option of a journal track that offers the rapid turnaround and dissemination times of a conference while providing the paper length, scientific rigor, and careful review process of an archival journal.

The ICDAR-IJDAR journal track invites high-quality submissions that present original work in the areas of Document Analysis and Recognition appropriate to both the International Conference on Document Analysis and Recognition (ICDAR) and the International Journal on Document Analysis and Recognition (IJDAR). Accepted papers will be published in a special issue of IJDAR and will receive an oral presentation slot at the ICDAR 2024 conference.

The submission site in Springer is now open: https://www.springer.com/journal/10032/updates/26200090

Important Dates

Submissions Due: November 14, 2023 (firm deadline)

Note that the deadline has been moved to 14 November 2023 because of the delay in opening this site. Please check the above website for further submission details and important dates.

For more details and guidelines, please visit: https://www.springer.com/journal/10032/updates/26200090


We would like to encourage anyone interested in doing reviews for this
interesting journal track to contact Prof. Elisa Barney elisa.barney@ltu.se.

7) ICDAR 2024 Call for Tutorials

The ICDAR 2024 Organizing Committee invites proposals for tutorials that
will be held between August 30 and September 4 (the exact final date will
be communicated as soon as possible), before the main conference begins.

Important Dates

Proposals Due: Nov. 30, 2023
Acceptance Notification: Dec. 23, 2023
Dates of Tutorials: Aug. 30 – Sep. 4, 2024

ICDAR 2024 Tutorials should serve one or more of the following objectives:

  • Introduce students and newcomers to major topics of Document Analysis
    and Recognition (DAR) research.
  • Provide instructions on established practices and methodologies.
  • Introduce expert non-specialists to a DAR subarea.
  • Survey a mature area of DAR research and/or practice.
  • Motivate and explain a DAR topic of emerging importance.
  • Overview DAR systems for industrial solutions (suggestion for
    researchers in industry).
  • Introduce some recent innovative techniques for DAR research and
    software quality, such as open-source libraries, high-level API, technical
    frameworks for expert developments, etc. (suggestion for expert

An ICDAR tutorial should aim to give a comprehensive overview of a specific
topic related to DAR. A good tutorial should be educational rather than
just a cursory survey of techniques. The topic should be of sufficient
relevance and importance to attract significant interest from the ICDAR
community. Typical tutorial audiences consist of PhD students studying
computer vision, image processing or pattern recognition, but also include
researchers and practitioners from both academia and industry. In order to
facilitate innovative collaboration and interaction between researchers in
academia and industry, the Tutorial Chairs strongly encourage proposals for
industrial tutorials, in which researchers in companies describe DAR
systems and overview industrial solutions to document analysis problems in
real use-case industrial scenarios.

Proposals should be up to 4 pages in length, and should contain the
following information:

  • Title of the tutorial.
  • Scope and motivation. A brief description of the tutorial, suitable
    for inclusion in the conference registration brochure.
  • Preference for the duration (full day or half day). Due to agenda
    constraints, half day tutorials are recommended. If a full day is
    needed, provide a brief justification.
  • A detailed outline of the tutorial. Course description with list of
    topics to be covered, along with a brief outline.
  • Relevance for ICDAR. A description of why the tutorial topic would be
    of interest to a substantial part of the ICDAR audience.
  • Expected target audience in terms of composition and estimated number
    of attendees. Prerequisite knowledge of the ICDAR audience for attending
    the tutorial.
  • Short CV of organizers. A brief CV of the presenter(s), including
    name, postal address, phone number, e-mail address, web page, background in
    the tutorial area (projects, relevant publications or tutorial-level
    articles on the subject), evidence of teaching experience.
  • The name and e-mail address of the corresponding presenter. The
    corresponding presenter should be available for e-mail correspondence
    during the evaluation process, in case clarifications and discussions
    on the scope and content of the proposal are needed.

The evaluation of the proposal will take into account its general interest
for ICDAR attendees, the quality of the proposal (e.g., a tutorial that
simply lists a set of concepts without any apparent rationale behind them
will not be approved) as well as the expertise and skills of the
presenters. We emphasize that the primary criterion for evaluation will be
whether a proposal is interesting, well-structured, and motivated in
relation to Document Analysis and Recognition, rather than the perceived
experience/standing of the proposer. Last but not least, the tutorial
should attract a meaningful audience, cover hot topics and bring new
knowledge to the community. Those submitting a proposal should keep in mind
that tutorials are intended to provide an overview of the field; they
should present reasonably well established information in a balanced way.
Tutorials should not be used to advocate a single avenue of research, nor
should they promote a product.

Tutorial slides must be provided to us for inclusion on the conference
website and also on the TC-10 and TC-11 websites, as educational material.
The ICDAR main conference organizers will handle the tutorial registration
and provide the space, coffee breaks and other facilities required to
organize tutorials (e.g. a room, a projector and a screen).

Submission Guidelines & Inquiries
All proposals should be submitted by electronic mail to the Tutorial
Chairs: – Alicia Fornes afornes@cvc.uab.es – Vincent Christlein

Feedback, comments and/or suggestions will be provided within two weeks of
receiving the proposal. Final acceptance (or rejection) will be decided by
December 23, 2023.
Inquiries should be sent to tutorials-chairs@icdar2024.net or the above

8) SSDA 2023: Report of the last summer school on document analysis

We are happy to announce that the 5th IAPR TC10/TC11 summer school on document analysis was a great success! It took place from July 3 to 7, 2023 in Fribourg and Moléson, Switzerland. 21 PhDs and young researchers from 8 different countries benefited from high-quality lectures by experts in the field:

  • Apostolos Antonacopoulos, University of Salford, United Kingdom, Large-scale Recognition of Information-rich Documents: From Unreadable Data to Structured Information.
  • Jean-Christophe Burie, University of La Rochelle, France, Analysis and understanding of comics: From the detection of basic elements to the creation of semantic links with classic and deep learning-based approaches.
  • Gernot Fink, TU Dortmund University, Germany, Deep Learning for Word Spotting: Foundations and Current Developments.
  • Andreas Fischer, University of Fribourg and University of Applied Sciences and Arts Western Switzerland, Structural methods for document analysis and recognition: From rule-based models to data-driven deep learning.
  • Alicia Fornes, Universitat Autonoma de Barcelona and CVC, Spain, Handwriting Recognition in Low Resource Scenarios.
  • C.V. Jawahar, IIIT Hyderabad, India, Towards a deeper understanding of documents.
  • Koichi Kise, Osaka Prefecture University, Japan, Reading of Reading for Actuating: Augmenting Human Learning by Experiential Supplements.
  • Rich Kent, CTO of Taina, UK, You can’t hide from tax anymore! A real world example of how one company has used document analysis and recognition to change the tax industry.

The video of some lectures will be available soon on the website: https://ssda2023.isc.heia-fr.ch/.

Alongside the courses, participants were able to actively take part in a pitch session and a poster session. A handwriting recognition competition was also held. Many social activities, scientific exchanges and campfires were organized, helping to create and/or strengthen links between participants.

Congratulations to Marco Peers, TU Vienna, who won the best poster award and the prize for excellence.

Finally, we would like to thank all the organizers and sponsors who made this event possible. Special thanks to IAPR for funding 4 grants and reducing accommodation costs for students, and to DIVA, DIUF, the Centenary Fund of the University of Fribourg, and HEIA for their financial and administrative contributions. Thank you to the speakers for sharing their time and knowledge.

See you soon for the 6th edition!

Anna Scius-Bertrand,
SSDA 23 organizing committee.

9) SSDA 2025: Call for proposals for the next summer school on document analysis

Important Dates

Proposal Submission Deadline: April 5, 2024

Submit Proposals via email to:

Part of the mission of the International Association for Pattern Recognition (IAPR) TC11 and TC10 is to promote high quality educational activities related to Reading Systems and Graphics Recognition. Responding to this need, TC10 and TC11 have established a series of summer schools. After the successful organization of summer schools in India, France, Pakistan, Sweden, and Switzerland, we are now soliciting proposals for the organization of the sixth “IAPR TC10/TC11 Summer School on Document Analysis” (SSDA) in 2025.

The “IAPR TC10/TC11 Summer School on Document Analysis” is intended to become the primary educational activity of IAPR TC11 (Reading Systems) and TC10 (Graphics Recognition). The School is meant to be a training activity where participants are exposed to the latest trends and techniques of Reading Systems and Graphics Recognition.

The aim of the School is to provide both an objective and clear overview and an in-depth analysis of the state-of-the-art research in selected topics of Reading Systems and Graphics Recognition. The School should aim to provide a stimulating opportunity for young researchers and PhD students in the field.

Individuals and groups who are interested in Reading Systems and Graphics Recognition are invited to submit proposals for organizing and hosting the 2025 IAPR TC10 / TC11 Summer School. As the previous summer schools were organized in Asia, Europe, and the Sub-continent, organizing teams from the Americas and Africa are encouraged to submit a bid in order to facilitate the envisioned rotational scheme of the IAPR TC10 / TC11 Summer School.

In order to fully plan their bid, proposers are expected to first familiarize themselves with the guidelines for organizing the School. The guidelines can be found on the TC11 website.

The submission of a bid implies full agreement with the rules and procedures for organizing the School. In particular, this means that organizers will apply for IAPR support and that the event will use the series title “IAPR TC10/TC11 Summer School on Document Analysis” with an optional sub-title denoting a special focus of the respective event.

Please consider submitting a proposal for this increasingly important event for the TC10/TC11 community. If you have questions, please do not hesitate to contact the TC11 and TC10 SSDA representatives: Foteini Simistira Liwicki (TC11 Representative) and Momina Moetesum (TC10 Representative).

Previous events:

As a reference, the 2023 Summer School on Document Analysis was held in Fribourg, Switzerland (URL: https://ssda2023.isc.heia-fr.ch/ )

10) DAS 2024: Call for Proposals

Deadline: November 30, 2023

Submission Method: email to jean-christophe.burie@univ-lr.fr and andreas.fischer@unifr.ch

Document Analysis Systems (DAS) is an IAPR sponsored workshop focusing on system-level issues and approaches. At DAS 2022, the participants decided to hold DAS as a satellite workshop of the annual ICDAR from 2024 onwards. We are seeking proposals to host the 16th Document Analysis Systems workshop co-located with ICDAR in 2024.

Anyone interested in submitting a proposal to host DAS 2024 in Greece should drop an email to jean-christophe.burie@univ-lr.fr and andreas.fischer@unifr.ch by November 30, 2023.

We are looking forward to receiving high quality proposals and making the first chapter of DAS-ICDAR 2024 a big success!

Jean-Christophe Burie (Chair, TC10)
Andreas Fischer (Chair, TC11)

11) ICPR 2024: Call for Papers

Greetings from the ICPR-2024 organizing committee.

The International Conference on Pattern Recognition (ICPR) is the flagship conference of the International Association for Pattern Recognition and the premier conference in Pattern Recognition, covering Computer Vision, Machine Learning, Image, Speech, and Sensor Pattern Processing, etc. ICPR-2024 is the 27th event of the series and will be held in Kolkata, India, on December 1-5, 2024. This conference provides a great opportunity to nurture new ideas and collaborations for students, academics, and industry researchers. ICPR-2024 will cover the following six tracks:

 – Artificial Intelligence, Pattern Recognition and Machine Learning

 – Computer and Robot Vision

 – Image, Speech, Signal and Video Processing

 – Biometrics and Human Computer Interaction

 – Document Analysis and Recognition

 – Biomedical Imaging and Bioinformatics

The main conference has several highlights, including keynotes by top experts, invited talks by academic and industry professionals, oral paper presentations, poster paper presentations, etc. ICPR-2024 will also have many workshops and tutorials.

Prospective authors are invited to submit papers.

Important deadlines:

Paper submission open: January 20, 2024
Paper submission deadline: March 20, 2024
Reviews sent to authors (Acceptance/rejection/Revision): June 20, 2024
Revision/rebuttal submission deadline: July 10, 2024
Final Acceptance notification: August 5, 2024
Camera-ready submission: August 31, 2024

Submission and Review:

ICPR-2024 will follow a single-blind review process. Authors can include their names and affiliations in the manuscript.

Paper Format and Length:

Papers must follow the Springer LNCS format, with a maximum of 15 pages (including references) at submission. To address reviewers’ comments, one more page is allowed (at no charge) in the revised/camera-ready submission. In addition, authors may purchase up to 2 extra pages.

Springer LNCS paper formatting instructions and templates for ICPR-2024 are available on the ICPR-2024 website (https://icpr2024.org/cfp.html).

The PDF version of the CFP has been attached for your kind reference. Please visit the ICPR-2024 website icpr2024.org for more details.


For any enquiry, please contact the ICPR-2024 Secretariat via email at icpr2024@gmail.com and icpr2024@isical.ac.in.

Please also circulate this call among your colleagues.

We look forward to your participation in ICPR-2024.

With regards

Umapada Pal, Indian Statistical Institute, Kolkata, India
Josef Kittler, University of Surrey, UK
Anil Jain, Michigan State University, USA
(ICPR-2024 General Chairs)

Rama Chellappa, Johns Hopkins University, USA
Apostolos Antonacopoulos, University of Salford, UK
Cheng-Lin Liu, Institute of Automation of Chinese Academy of Sciences, China
Subhasis Chaudhuri, Indian Institute of Technology Bombay, India
(ICPR-2024 Program Chairs)

12) ICDAR 2027: Call for Hosting Proposals

Deadline: January 31, 2024

Submission Method: Email to the TC10/11 chairs Andreas Fischer andreas.fischer@hefr.ch and Jean-Christophe Burie jean-christophe.burie@univ-lr.fr

The International Conference on Document Analysis and Recognition (ICDAR) is the flagship event of TC10/11. It was established as a biennial conference in 1991 and has been held annually since 2023. The aim of ICDAR is to bring together international experts to share their experiences and to promote research and development in all areas of Document Analysis and Recognition. The ICDAR Advisory Board is seeking proposals to host the 21st International Conference on Document Analysis and Recognition, to be held in 2027 (ICDAR 2027).

Any consortium interested in making a proposal to host an ICDAR should first familiarise themselves with the “Guidelines for Organizing and Bidding to Host ICDAR” document which is available on the TC10 and TC11 websites (https://iapr-tc10.univ-lr.fr and http://www.iapr-tc11.org, respectively).

A link to the most current version of the guidelines appears below. Please check on the website of TC11 for the latest version.

Note that the current guidelines still refer to a biennial event. An updated version that reflects the change to an annual conference is coming soon.


The submission of a bid implies full agreement with the rules and procedures outlined in that document.

The submitted proposal must define clearly the items specified in the guidelines (Section 5.2).

It has been the tradition that the location of ICDAR conferences follows a rotating schedule among different continents. Hence, proposals from the Americas are strongly encouraged. However, high quality bids from other locations, for example, from countries where we have had no ICDAR before, will also be considered. Proposals will be examined by the ICDAR Advisory Board.

Proposals should be emailed to the TC10/11 chairs Andreas Fischer andreas.fischer@hefr.ch and Jean-Christophe Burie jean-christophe.burie@univ-lr.fr by January 31, 2024.

ICDAR Advisory Board,
Andreas Fischer (Chair, TC11), Jean-Christophe Burie (Chair, TC10), Anna Esposito (Chair, IAPR C&M), Koichi Kise, Dimosthenis Karatzas, Srirangaraj Setlur

13) Open Call for Organizing DAR Events

The IAPR technical committees on graphics recognition (TC10) and reading systems (TC11) regularly organize scientific events for the Document Analysis and Recognition (DAR) community, including the ICDAR flagship conference.

In addition to specific calls for bids to host one of the events, we encourage teams to announce their interest in organizing one of the following events:

  • ICDAR: International Conference on Document Analysis and Recognition (annually; next possibility in 2027)
  • DAS: International Workshop on Document Analysis Systems (satellite event of ICDAR in even years; next possibility in 2024)
  • GREC: International Workshop on Graphics Recognition (satellite event of ICDAR in odd years; next possibility in 2025)
  • SSDA: Summer School on Document Analysis (biennially in odd years; next possibility in 2025)

Anyone interested in hosting one of these events is invited to announce their interest via email to jean-christophe.burie@univ-lr.fr and andreas.fischer@unifr.ch, in order to receive feedback and support for preparing a proposal.

Jean-Christophe Burie (Chair, TC10)
Andreas Fischer (Chair, TC11)

14) Special Issue: “Advanced Topics in Document Analysis and Recognition”

Editorial from Koichi Kise, Richard Zanibbi, Rajiv Jain & Gernot A. Fink:

The ongoing advancement of deep learning techniques, such as the Transformer and Large Language Models, continues to enhance both the accuracy and efficiency of methods in the field of Document Analysis and Recognition, while also broadening its scope. The primary objective of this Special Issue is to keep pace with these developments and showcase the latest advancements in document analysis and recognition. Upholding our tradition established for journal track papers from the ICDAR conference with issues released in 2019 and 2021, we present the third edition of the Special Issue, under the same title.

Despite the persistent effects of the COVID-19 pandemic, we initiated a call for papers, disseminating this widely through the web pages of IJDAR and ICDAR. By November 2022, we received 33 submissions that were deemed relevant to the scope of our work. Each submission was assigned to one of us as guest editor, with careful attention paid to avoiding any conflicts of interest. We procured reviews from experts in the field, adhering to the journal’s standard practices. After a rigorous review process, often involving two or three rounds, we accepted 13 papers for publication in this special issue. These papers reflect both the breadth and depth of current research in the field of Document Analysis and Recognition.

The accepted papers can be grouped into several categories: document image processing and classification (two papers), historical document analysis (four papers), character recognition (one paper), online handwriting analysis (two papers), layout analysis (two papers), and applications (two papers). We will briefly summarize the accepted papers in…

15) IJDAR article alert (vol. 26, issue 4)

Volume 26, issue 4, December 2023. Please find below the 5 articles:

Trajectory-based recognition of in-air handwritten Assamese words using a hybrid classifier network
Ananya Choudhury & Kandarpa Kumar Sarma

HWNet v3: a joint embedding framework for recognition and retrieval of handwritten text
Praveen Krishnan, Kartik Dutta & C. V. Jawahar

An end-to-end pipeline for historical censuses processing (open access)
Rémi Petitpierre, Marion Kramer & Lucas Rappo

A brief review of state-of-the-art object detectors on benchmark document images datasets
Trong Thuan Nguyen, Hai Le, Truong Nguyen, Nguyen D. Vo & Khang Nguyen

Review of chart image detection and classification
Filip Bajić & Josip Job

16) Job offers – 1 new

Post-doctoral position at the Computer Vision Center, Barcelona

We are seeking two postdoctoral researchers to join the Vision, Language and Reading group at the Computer Vision Center (CVC), in Barcelona, Spain, one focused on FEDERATED LEARNING AND DIFFERENTIAL PRIVACY and the other on COMPUTER VISION.

The position is available for a minimum of 2 years and is linked to the “European Lighthouse on Secure and Safe AI” (ELSA), a European Project funded by Horizon Europe and backed by the ELLIS network of excellence. The project covers research topics that include robustness, privacy and human agency and will develop use cases in areas such as autonomous driving, robotics, health and document intelligence.


The candidate should possess a PhD in machine learning or computer vision and have a strong publication record. We are looking for candidates who have publications in top conferences like CVPR, ECCV, ICCV, ICDAR, NeurIPS, ICML, ICLR.

The candidate should have a strong background in machine learning and computer vision. Experience in document image analysis and/or visual question answering would be a plus. The applicants are expected to be fluent in both oral and written communication in English. They should work well in a team while demonstrating initiative and independence. The candidate is expected to co-supervise PhD students.

The successful candidate is expected to contribute to the design and development of AI solutions for document understanding, employing privacy preserving techniques and infrastructures set up by the ELSA project.


The selected candidate will work in the Computer Vision Centre (CVC), Barcelona, a research institute comprising more than 130 researchers and support staff, dedicated to computer vision research and knowledge transfer. With a strong international projection and links to the industry, the Computer Vision Centre offers an exciting environment for scientific career development. The Computer Vision Centre has a plan for expansion of its permanent research staff base and has received the “HR Excellence in Research” award as a provider and supporter of a stimulating and favourable working environment.

The direct supervisor for these posts will be Dr Dimosthenis Karatzas, who leads the Vision, Language and Reading research group at the CVC.

Barcelona is a vibrant city and an important Artificial Intelligence hub. The high quality of life is combined with the open and international character of the city. Barcelona is very well connected by air, sea and ground transportation. The region of Catalonia boasts its own AI strategy, in which the CVC is a key player.


If you are interested in the position, please contact Dr Dimosthenis Karatzas for more information and applications (dimos@cvc.uab.es).


Apply by filling in the online form at:
Computer Vision: http://www.cvc.uab.es/blog/2023/01/11/postdoc-position-in-computer-vision/
FL and DP: http://www.cvc.uab.es/blog/2023/01/11/postdoc-position-in-computer-vision/


ELSA project: https://elsa-ai.eu/
Computer Vision Center: http://www.cvc.uab.es/
Vision, Language and Reading group: https://www.vlr.ai/

Ph.D. Student: Multimodal models for Document Image Understanding, LITIS, France

Link: https://www.litislab.fr/en/job/phd-student-multimodal-models-document-image-understanding


Deep Learning, Vision, OCR, Document Understanding, Natural language processing

Context and main objectives

The digital transformation of libraries, which has been based on OCR (Optical Character Recognition) technology for more than 20 years, faces some limitations both in terms of quality, due to the diversity of the collections and the limitations of OCR technology, and in terms of added value due to a lack of structuring and high-level indexing. Named entity extraction is still little used because it relies on language processing technologies, which were not very adaptable until recently. More generally, the semantic indexing of collections is underdeveloped and integrated with metadata. We propose to develop multimodal models (text + image) for the extraction of information from collections of digitized documents in large libraries. The literature shows that work in this direction is still underdeveloped, and that it is mainly aimed at processing commercial documents (invoices, etc.).

The proposed project aims to disrupt the traditional sequential document processing workflow by combining Vision models and Large Language Models (LLM) to provide a more streamlined and efficient approach. The standard two-stage architectures based on OCR + NER (Optical Character Recognition, Named Entity Recognition) are now giving way to end-to-end multimodal approaches known as Document Understanding, which are more versatile and easily adaptable to new corpora, making it easier and more cost-effective to set up and run document processing projects. As a result, this accessible, user-friendly approach will democratize access to advanced AI technologies for a wider range of institutions, contributing to the evolution of the technology value chain in the Libraries, Archives and Museums (LAM) sector and opening up new opportunities for research and discovery.

The proposed work program, funded by the FINLAM project (Foundation INtegrated models for Libraries Archives and Museum, ANR 2023), relies on the expertise of LITIS to study the most relevant multimodal architectures to integrate the language knowledge conveyed by the large language models developed recently and to study the modalities of specialization/adaptation of these models in conjunction with the learning of a generic optical encoder, benefiting from the annotated collections available at the French national library (Bibliothèque nationale de France – BnF). User interaction will be considered according to different scenarios of closed and open queries.


Thierry Paquet, Thierry.Paquet@univ-rouen.fr

Pierrick Tranouez, Pierrick.Tranouez@univ-rouen.fr

Clément Chatelain, Clement.Chatelain@insa-rouen.fr