Joint Workshop of the 4th Extraction and Evaluation of Knowledge Entities from Scientific Documents (EEKE2023) and the 3rd AI + Informetrics (AII2023)

at the ACM/IEEE Joint Conference on Digital Libraries 2023 (JCDL2023), Santa Fe, New Mexico, USA


News: Call for Papers: Special issue on “AI + Informetrics” (AII) in the Journal of Informetrics. More detailed information about this special issue is available at:

News: The workshop proceedings of EEKE-AII 2023 have been published and can be accessed at <>.

News: Prof. Scott W. Cunningham (University of Strathclyde in Glasgow) has accepted our invitation to give a keynote at EEKE-AII 2023. The title of the keynote is: Scientometrics in the Era of Large Language Models.

News: Prof. C. Lee Giles (Pennsylvania State University) has accepted our invitation to give a keynote at EEKE-AII 2023. The title of the keynote is: Large Language Models for Information Retrieval and Extraction.


Call for Papers

You are invited to participate in the Joint Workshop of the 4th Extraction and Evaluation of Knowledge Entities from Scientific Documents (EEKE2023) and the 3rd AI + Informetrics (AII2023), to be held as part of the ACM/IEEE Joint Conference on Digital Libraries 2023, Santa Fe, New Mexico, USA, June 26 - 30, 2023.

Aim of the Workshop

In the era of big data, massive amounts of information and data have dramatically changed human civilization. The broad availability of information provides more opportunities for people, but a new challenge is rising: how can we obtain useful knowledge from numerous information sources? A knowledge entity is a relatively independent and integral knowledge module in a specific discipline or research domain [1]. As a crucial medium for knowledge transmission, scientific documents that contain a large number of knowledge entities attract the attention of scholars [2]. Complementarily, informetrics, known as the study of quantitative aspects of information, has benefited greatly from artificial intelligence (AI), with its capacities in analyzing unstructured, scalable data and streams, understanding uncertain semantics, and developing robust and repeatable models. Incorporating AI techniques into informetrics has demonstrated enormous success in turning big data into big value and impact. For example, deep learning approaches have advanced studies of pattern recognition and have been used to leverage time series to track technological change. However, how to effectively combine the power of AI and informetrics to create cross-disciplinary solutions remains elusive from both theoretical and practical perspectives.

This workshop aims to engage related communities in open problems in the extraction and evaluation of knowledge entities from scientific documents and in AI + Informetrics. Specifically, knowledge entities in scientific documents may include method entities, tasks, datasets and metrics, software and tools, etc. [3]. Applications of knowledge entities include the construction of knowledge entity graphs and roadmaps, modeling the functions of knowledge entity citations, etc. There are some online platforms based on knowledge entities, e.g., SAGE Research Methods and the ‘SOTA’ project. In parallel, this workshop also targets certain unsolved issues in AI + Informetrics and a wide range of its practical scenarios, including: combining AI and informetrics to fill cross-disciplinary gaps from either theoretical or practical perspectives; elaborating AI-empowered informetric models with enhanced capabilities in robustness, adaptability, and effectiveness; and leveraging knowledge, concepts, and models in information management to strengthen the interpretability of AI + Informetrics so that it can adapt to empirical needs in real-world cases [4].
This joint workshop frames these two cutting-edge and cross-disciplinary directions as:

  • Extraction and Evaluation of Knowledge Entity (EEKE), highlighting the development of intelligent methods for identifying knowledge entities from scientific documents, and promoting their application in broad information studies.
  • AI + Informetrics (AII), emphasizing endeavors in integrating AI and informetrics by constructing fundamental theories, developing novel methodologies, bridging conceptual knowledge with practical uses, and creating real-world solutions.
This workshop aims to gather researchers and practitioners and to offer a collaborative platform for exchanging ideas, sharing pilot studies, and scoping future directions.


Workshop Topics

This workshop is primarily designed for academic researchers in information and library sciences, science of science, and artificial intelligence, and will also be of interest to librarians, ST&I administrators and policymakers, and practitioners in any related sectors.
We invite stimulating research on topics including, but not limited to, methods of knowledge entity extraction and applications of knowledge entities. Specific examples of fields of interest include:

  • Task and methodology extraction from scientific documents
  • Model and algorithm entity extraction from scientific documents
  • Dataset and metrics mention extraction from scientific documents
  • Software and tool extraction from scientific documents
  • Knowledge entity summarization
  • Relation extraction of knowledge entity
  • Modeling function of knowledge entity citation
  • Informetrics with machine learning (including deep learning)
  • Informetrics with natural language processing or computational linguistics
  • Informetrics with computer vision
  • Informetrics with other related AI techniques (e.g., information retrieval)
  • AI for science of science
  • AI for science, technology, & innovation
  • AI for research policy and strategic management
  • Application of knowledge entity extraction
  • Applications of AI-empowered informetrics



1st Keynote: Scientometrics in the Era of Large Language Models

Abstract: This keynote concerns the emergence of large language models, most recently and popularly embodied by ChatGPT. These models have been trained using deep learning algorithms on a massive corpus of text data collected from sources across the web. These models demand a fundamental change in the conduct of scientometric research. In this speech I describe why I think such changes are necessary, and how such research changes may be enacted.

Dr. Scott W. Cunningham is a Professor of Technology Policy at the University of Strathclyde in Glasgow, UK. He is one of the editors-in-chief of the scientific journal Technological Forecasting and Social Change. He is interested in data and governance, and has published on data use in the public sector as well as on the governance of new technologies, including machine learning. Prior to joining Strathclyde he held an appointment at the Delft University of Technology. Prior to his academic career he worked as a data scientist in industry, earning national and international patents for his work on automation and pricing. He has previously published books on technology mining, technology management, and technology forecasting with Alan Porter (professor emeritus, Georgia Tech). His current book, commissioned by Edward Elgar (2025), concerns the politics of technological change.

2nd Keynote: Large Language Models for Information Retrieval and Extraction

Abstract: Large language models (LLMs) have changed AI and in many ways determine what AI is and can do. Though there have been many applications and uses of LLMs, we will discuss what LLMs mean for information retrieval and extraction. Already an LLM is used in Google ranking. How else can LLMs be used in these areas, and what can and can’t they do?

Dr. C. Lee Giles is the David Reese Professor in Information Sciences and Technology, Graduate College Professor of Computer Science and Engineering, courtesy Professor of Supply Chain and Information Systems, and Director of the Intelligent Systems Research Laboratory at the Pennsylvania State University, University Park, PA. His research areas include intelligent information processing systems, text processing, deep learning, digital libraries, and novel research search engines. He was a co-creator and is now director of the scholarly digital library search engine CiteSeer. He is a Fellow of the ACM, IEEE, and International Neural Network Society (INNS). He is the recipient of the INNS Gabor prize, the IEEE CIS Neural Networks Pioneer Award, the NFAIS Miles Conrad Award, and, twice, the IBM Distinguished Faculty Award. He has published over 600 peer-reviewed conference papers and journal articles with collaborators and students, including publications in Science, Nature, PLoS ONE, Proceedings of the National Academy of Sciences, and many other conference and journal venues. His publications have over 57,000 citations and an h-index of 115, according to Google Scholar. He can be reached at:


The workshop will be held on June 26~27, 2023 (Beijing Time), and specific activities include keynotes, paper presentations and a poster session.

June 26, 6:30 am - 12:20 pm (June 26, 8:30 pm ~ June 27, 2:20 am, Beijing Time)
UTC/GMT-6 | Beijing Time | Activity | Speaker / Session Chair
6:00-6:30 (June 26) | 20:00-20:30 (June 26) | Connection setup: we will provide details |
6:30-6:40 | 20:30-20:40 | Opening Remarks (Online) | Co-Chairs of EEKE-AII2023 (Chengzhi Zhang, Yi Zhang, Philipp Mayr, Wei Lu, Arho Suominen, Haihua Chen, and Ying Ding)
6:40-7:10 | 20:40-21:10 | Session 1: Poster (Online) | Chair: Hongshu Chen
6:40-6:50 | 20:40-20:50 | An Approach for Identifying Complementary Patents Based on Deep Learning | Jinzhu Zhang and Jialu Shi
6:50-7:00 | 20:50-21:00 | Functional Structure Recognition of Scientific Documents in Information Science | Dayu Yan, Si Shen and Dongbo Wang
7:00-7:10 | 21:00-21:10 | Linkages among Science, Technology, and Industry | Shuo Xu, Zhen Liu and Xin An
7:10-8:00 | 21:10-22:00 | Session 2: Entity Extraction and Applications (Online) | Chair: Yingyi Zhang
7:10-7:30 | 21:10-21:30 | The Relationship of Interdisciplinarity, Entity Features and Clinical Translation Potential of COVID-19 Papers | Shuang Chen and Chunli Liu
7:30-7:45 | 21:30-21:45 | LLM-based Entity Extraction Is Not for Cybersecurity | Maxime Würsch, Andrei Kucharavy, Dimitri Percia David and Alain Mermoud
7:45-8:00 | 21:45-22:00 | Characterizing Emerging Technologies of Global Digital Humanities Using Scientific Method Entities | Shaojian Li and Chengxi Yan
8:05-9:05 | 22:05-23:05 | Keynote 1: Scientometrics in the Era of Large Language Models (Online presentation + Onsite) | Scott W. Cunningham; Chair: Yi Zhang
9:10-10:15 | 23:10 (June 26)-00:15 (June 27) | Session 3: AI + Informetrics (Online presentation + Onsite) | Chair: Mengjia Wu
9:10-9:30 | 23:10-23:30 | Identifying Potential Sleeping Beauties Based on Dynamic Time Warping Algorithm and Citation Curve Benchmarking | Zewen Hu, Yu Chen and Jingjing Cui
9:30-9:45 | 23:30-23:45 | Scientific knowledge combination in networks: new perspectives on analyzing knowledge absorption and integration | Hongshu Chen, Jingkang Liu and Zikai Liu
9:45-10:00 | 23:45-24:00 | UnScientify: Detecting Scientific Uncertainty in Scholarly Full Text | Panggih Kusuma Ningrum, Philipp Mayr and Iana Atanassova
10:00-10:15 | 00:00-00:15 (June 27) | Sentiment Classification of Scientific Citation Based on Modified BERT Attention by Sentiment Dictionary | Dahai Yu and Bolin Hua
10:20-11:20 | 00:20-01:20 (June 27) | Keynote 2: Large Language Models and Information Extraction (Online presentation + Onsite) | C. Lee Giles; Chair: Jian Wu
11:25-12:15 | 01:25-02:15 (June 27) | Session 4: EEKE + AII Onsite Session (Onsite presentation + Online) | Chair: Haihua Chen
11:25-11:45 | 01:25-01:45 | ClaimDistiller: Scientific Claim Extraction with Supervised Contrastive Learning | Xin Wei, Md Reshad Ul Hoque, Jian Wu, Jiang Li
11:45-12:05 | 01:45-02:05 | Forecasting Future Topic Trends in the Blockchain Domain: Using Graph Convolutional Network | Yejin Park, Seonkyu Lim, Changdai Gu and Min Song
12:05-12:15 | 02:05-02:15 | How does AI assist scientific research domains? Evidence based on 26 millions research articles | Qianqian Xu, Jie Meng, Jiangen He and Wen Lou
12:15-12:20 | 02:15-02:20 | Greeting Notes of EEKE2022 | Co-Chairs of EEKE-AII2023 (Chengzhi Zhang, Yi Zhang, Philipp Mayr, Wei Lu, Arho Suominen, Haihua Chen, and Ying Ding)
12:20 | 02:20 | End of workshop |


Submission Information

Regular papers: All submissions must be written in English, following the ACM Proceedings template (10 pages for full papers and 4 pages for short papers, excluding references, which have no page limit), and should be submitted as PDF files to EasyChair.

Poster & demonstration: We welcome submissions detailing original, early findings, works in progress and industrial applications of knowledge entity extraction and evaluation for a special poster session, possibly with a 2-minute presentation in the main session. Some research track papers will also be invited to the poster track instead, although there will be no difference in the final proceedings between poster and research track submissions. These papers should follow the same format as the research track papers but can be shorter (2 pages for poster and demo papers).

Submit a paper

All submissions will be reviewed by at least two independent reviewers. Please note that at least one author per paper needs to register for the workshop and attend it to present the work. In case of a no-show, the paper (even if accepted) will be removed from the proceedings and from the program.

Workshop proceedings will be deposited online in the CEUR workshop proceedings publication service. This way the proceedings will be permanently available and citable (with persistent digital identifiers and long-term preservation).

Special Issue

Authors of accepted submissions will be invited to submit to our special issue in Technological Forecasting and Social Change or the Journal of Informetrics. More detailed information about this special issue is available at:


Important Dates

All dates are Anywhere on Earth (AoE).

Deadline for submission: May 10, 2023
Notification of acceptance: June 1, 2023
Camera ready: June 20, 2023
Workshop: June 26, 2023


Main Organising Committee

Chengzhi Zhang is a professor in the Department of Information Management, Nanjing University of Science and Technology, China. He received his PhD degree in Information Science from Nanjing University, China. He has published more than 100 publications, including in JASIST, Aslib JIM, JOI, OIR, SCIM, ACL, NAACL, etc. His current research interests include scientific text mining, knowledge entity extraction and evaluation, and social media mining. He serves as Editorial Board Member and Managing Guest Editor for 10 international journals (Patterns, IPM, OIR, Aslib JIM, TEL, JDIS, DIM, DI, etc.) and as a PC member of several international conferences in the fields of natural language processing and scientometrics.

Yi Zhang works as a Senior Lecturer at the Australian Artificial Intelligence Institute, University of Technology Sydney. He holds dual Ph.D. degrees in Management Science & Engineering and in Software Engineering. His research interests align with intelligent bibliometrics: incorporating artificial intelligence and data science techniques with bibliometric indicators for broad science, technology & innovation studies. He is the recipient of the 2019 Discovery Early Career Researcher Award granted by the Australian Research Council. He serves as an Associate Editor for Technol. Forecast. & Soc. Change, an Editorial Board Member for the IEEE Trans. Eng. Manage., and an Advisory Board Member for the International Center for the Study of Research.

Philipp Mayr is a team leader at the GESIS - Leibniz-Institute for the Social Sciences, in the department Knowledge Technologies for the Social Sciences (WTS). He received his PhD in applied informetrics and information retrieval from the Berlin School of Library and Information Science at Humboldt University Berlin. He has published in top conferences and prestigious journals in the areas of informetrics, information retrieval and digital libraries. His research group focuses on methods and techniques for interactive information retrieval and data set search. He was the main organizer of the BIR workshops at ECIR 2014-2021 and the BIRNDL workshops at JCDL 2016 and SIGIR 2017-2019.

Wei Lu is a professor in the School of Information Management and director of the Information Retrieval and Knowledge Mining Center, Wuhan University. He received his PhD degree in Information Science from Wuhan University, China. His current research interests include information retrieval, text mining, QA, etc. He has published papers in SIGIR, Information Sciences, JASIST, Journal of Information Science, etc. He serves in diverse roles (e.g., Associate Editor, Editorial Board Member, and Managing Guest Editor) for several journals.

Arho Suominen is Principal Scientist at the VTT Technical Research Centre of Finland and Industrial Professor at Tampere University (Finland). Dr. Suominen’s research focuses on qualitative and quantitative assessment of innovation systems with a special focus on quantitative methods. His prior research has been funded by the European Commission via H2020, the Academy of Finland, the Finnish Funding Agency for Technology, the Turku University Foundation and the Fulbright Center Finland. Through the Fulbright program, he worked as a Visiting Scholar at the School of Public Policy at the Georgia Institute of Technology. Dr. Suominen has a Doctor of Science (Tech.) degree from the University of Turku and holds an Officer’s basic degree from the National Defence University of Finland.

Haihua Chen is a clinical assistant professor in the Department of Information Science at the University of North Texas. He has expertise in applied data science, natural language processing, information retrieval, and text mining. He has co-authored more than 40 publications in academic venues in both information science and computer science. He serves as co-editor of The Electronic Library, as guest editor of special issues of Information Discovery & Delivery and Frontiers in Big Data, and as a reviewer for 14 peer-reviewed journals and several international conferences.

Ying Ding is the Bill & Lewis Suit Professor at the School of Information, University of Texas at Austin. She has been involved in various NIH, NSF and European Union funded projects. She has published 240+ papers in journals, conferences, and workshops, and served as a program committee member for 200+ international conferences. She is the co-editor of the book series Semantic Web Synthesis published by Morgan & Claypool, the co-editor-in-chief of Data Intelligence published by MIT Press and the Chinese Academy of Sciences, and an editorial board member for several top journals in Information Science and Semantic Web. Her current research interests include data-driven science of science, AI in healthcare, Semantic Web, knowledge graphs, data science, scholarly communication, and the application of Web technologies.


Programme Committee

  • Alireza Abbasi, University of New South Wales (Canberra)
  • Andrea Scharnhorst, DANS-KNAW
  • Iana Atanassova, CRIT, Université de Bourgogne Franche-Comté
  • Marc Bertin, Université Claude Bernard Lyon 1
  • Katarina Boland, GESIS - Leibniz Institute for the Social Sciences
  • Yi Bu, Peking University
  • Guillaume Cabanac, IRIT - Université Paul Sabatier Toulouse 3
  • Caitlin Cassidy, Search Technology Inc
  • Chong Chen, Beijing Normal University
  • Hongshu Chen, Beijing Institute of Technology
  • Gong Cheng, Nanjing University
  • Jian Du, Peking University
  • Edward Fox, Virginia Tech
  • Ying Guo, China University of Political Science and Law
  • Arash Hajikhani, VTT Technical Research Centre of Finland
  • Saeed-Ul Hassan, Information Technology University
  • Jiangen He, The University of Tennessee
  • Zhigang Hu, Dalian University of Technology
  • Bolin Hua, Peking University
  • Ying Huang, Wuhan University
  • Yuya Kajikawa, Tokyo University of Technology
  • Vivek Kumar Singh, Banaras Hindu University, Varanasi, U.P., India
  • Chenliang Li, Wuhan University
  • Kai Li, Renmin University of China
  • Chao Lu, Hohai University
  • Shutian Ma, Tencent
  • Jin Mao, Wuhan University
  • Xianling Mao, Beijing Institute of Technology
  • Chao Min, Nanjing University
  • Wolfgang Otto, GESIS - Leibniz-Institute for the Social Sciences
  • Xuelian Pan, Nanjing University
  • Dwaipayan Roy, GESIS - Leibniz-Institute for the Social Sciences
  • Philipp Schaer, TH Köln (University of Applied Sciences)
  • Mayank Singh, Indian Institute of Technology Gandhinagar
  • Bart Thijs, ECOOM, MSI, K.U.Leuven
  • Suppawong Tuarob, Mahidol University
  • Dongbo Wang, Nanjing Agricultural University
  • Xuefeng Wang, Beijing Institute of Technology
  • Yuzhuo Wang, Nanjing University of Science and Technology
  • Dietmar Wolfram, University of Wisconsin-Milwaukee
  • Jian Wu, Old Dominion University
  • Mengjia Wu, University of Technology Sydney
  • Tianxing Wu, Southeast University
  • Xiaolan Wu, Nanjing Normal University
  • Yanghua Xiao, Fudan University
  • Jian Xu, Sun Yat-sen University
  • Shuo Xu, Beijing University of Technology
  • Erjia Yan, Drexel University
  • Heng Zhang, Nanjing University of Science and Technology
  • Jinzhu Zhang, Nanjing University of Science and Technology
  • Xiaojuan Zhang, Southwest University
  • Yingyi Zhang, Soochow University
  • Zhixiong Zhang, National Science Library, Chinese Academy of Sciences
  • Qingqing Zhou, Nanjing Normal University
  • Yongjun Zhu, Yonsei University


  1. Chang, X., Zheng, Q. (2008). Knowledge Element Extraction for Knowledge-Based Learning Resources Organization. In: Leung, H., Li, F., Lau, R., Li, Q. (eds) Advances in Web Based Learning – ICWL 2007. ICWL 2007. Lecture Notes in Computer Science, vol 4823. Springer, Berlin, Heidelberg.
  2. Ding, Y., Song, M., Han, J., Yu, Q., Yan, E., Lin, L., & Chambers, T. (2013). Entitymetrics: measuring the impact of entities. PLoS ONE, 8(8), e71416.
  3. Zhang, C., Mayr, P., Lu, W., & Zhang, Y. (2022). JCDL2022 workshop: extraction and evaluation of knowledge entities from scientific documents (EEKE2022). In Proceedings of the 22nd ACM/IEEE Joint Conference on Digital Libraries (JCDL '22). Association for Computing Machinery, New York, NY, USA, Article 54, 1–2.
  4. Zhang, Y., Zhang, C., Mayr, P., & Suominen, A. (2022). An editorial of “AI + informetrics”: multi-disciplinary interactions in the era of big data. Scientometrics, 127, 6503–6507.


Past Proceedings & Journal Special Issues

Proceedings can be accessed at: We have organized three special issues on the topic of extraction and evaluation of knowledge entities in the Journal of Data and Information Science, Data and Information Management, Aslib Journal of Information Management and Scientometrics, respectively. The first AII workshop was held at iConference2021 and the second was held at the IP&MC 2022 Annual Conference. We have organized two special issues on the topic of AI + Informetrics in Scientometrics and Information Processing and Management, respectively.