Difference between revisions of "Acronym paper"
Line 3: | Line 3: | ||
E.g. for Natural Language Processing for/using Knowledge Graphs (e.g. entity linking and resolution using target knowledge such as Wikidata and DBpedia, foundation models) | E.g. for Natural Language Processing for/using Knowledge Graphs (e.g. entity linking and resolution using target knowledge such as Wikidata and DBpedia, foundation models) | ||
+ | |||
+ | Usecase: Lookup an event by Acronym e.g. ESWC -> https://scholia.toolforge.org/event-series/Q17012957 | ||
In the process of digitalization of scientific publishing PID have been introduced for quite a few entities such as Papers(DOI), Authors (ORCID), Organizations(ROR) but unfortunately not for scientific events and series where the most common disambiguating identifier is still | In the process of digitalization of scientific publishing PID have been introduced for quite a few entities such as Papers(DOI), Authors (ORCID), Organizations(ROR) but unfortunately not for scientific events and series where the most common disambiguating identifier is still |
Revision as of 18:35, 1 March 2023
- Acronym definition see Acronym
Research questions
E.g. for Natural Language Processing for/using Knowledge Graphs (e.g. entity linking and resolution using target knowledge such as Wikidata and DBpedia, foundation models)
Usecase: Lookup an event by Acronym e.g. ESWC -> https://scholia.toolforge.org/event-series/Q17012957
In the process of digitalization of scientific publishing PID have been introduced for quite a few entities such as Papers(DOI), Authors (ORCID), Organizations(ROR) but unfortunately not for scientific events and series where the most common disambiguating identifier is still acronyms/short names such as ESWC 2023/Semantics '23. (Only very few instances have PIDs DOI (200)/ pseudo PIDs Wikidata Id (9000/1000).
We estimate that some 5000 (dblp)-25.000 and some 50.000 (dblp) to 250.000 events/eventseries that have - public digital traces (as part of their lifecyle) such as homepages, entries in public cfps, library indices for their proceedings, homepages - would still need PIDs.
To create PIDs and enter the metadata in public KGs such as wikidata acronyms look like a promising tool for disambiguation (as has been proven by the Work of Simon Cobb using OpenRefine ...)
- What do acronyms for scientific events and event series look like and how formal can they be described?
- How well do acronyms disambiguate scientific events and event series?
- How well is the acronym information curated in metadata sources for events and event series
- How well are acronyms used in citations of scientific events and event series?
- Acronym checker - does the Acronym fit the long version ...
Method
What do acronyms for scientific events and event series look like and how formal can they be described?
- Try regular expressions see Acronym_-_Regular_Expressions
- Check length histograms see https://github.com/WolfgangFahl/ConferenceCorpus/blob/main/tests/testAcronymCategory.py
Results
What do acronyms look like
Length distribution
WikiCFP
Standard case
60% of all WikiCFP acronyms extracted are matching the regular expression
[A-Z]+\s*[12][0-9]{3}
e.g. ISWC 2012
43990/73731 ( 59.7%) matches for [A-Z]+\s*[12][0-9]{3} 654/43989 ( 1.5%) year different
Corner cases
long acronyms tend to indicate the extraction has not worked or there is some other issue with the acronym such as indicating a joint / colocated situation
SELECT acronym
FROM "event_wikicfp"
where length(acronym)=40
The acroynm entries with a length of 40 are mostly not acronyms ...
... Political Theology Agenda Symposium 2010 Knowledge Engineering Special Issue 2010 CFP MapReduce Special Issue of CCPE 2010 AOSD - Student Research Competition 2011 special session for Wireless VITAE 2011 Political Theology Agenda Symposium 2011 12th EANN / 7th AIAI Joint Congress 2011 ...
Exotic cases / Outliers
There is only one entry in wikicfp where the extracted acronym was longer than 50 chars.
SELECT acronym,url
FROM "event_wikicfp"
where length(acronym)>50
call for chapters - images of female aggression 2016 http://www.wikicfp.com/cfp/servlet/event.showcfp?eventid=52302
This is not a call for papers for scientific events at all.