Difference between revisions of "Acronym paper"
Revision as of 18:33, 1 March 2023
- Acronym definition see Acronym
Research questions
E.g. for Natural Language Processing for/using Knowledge Graphs (e.g. entity linking and resolution using target knowledge such as Wikidata and DBpedia, foundation models)
In the process of digitalizing scientific publishing, PIDs have been introduced for quite a few entities such as papers (DOI), authors (ORCID) and organizations (ROR), but unfortunately not for scientific events and event series, where the most common disambiguating identifiers are still acronyms/short names such as ESWC 2023/Semantics '23. (Only very few instances have PIDs (DOI: 200) or pseudo-PIDs (Wikidata ID: 9000/1000).)
We estimate that some 5,000 (dblp) to 25,000 and some 50,000 (dblp) to 250,000 events/event series that have public digital traces (as part of their lifecycle), such as homepages, entries in public CFPs or library indices for their proceedings, would still need PIDs.
To create PIDs and enter the metadata in public KGs such as Wikidata, acronyms look like a promising tool for disambiguation (as has been proven by the work of Simon Cobb ...).
- What do acronyms for scientific events and event series look like and how formally can they be described?
- How well do acronyms disambiguate scientific events and event series?
- How well is the acronym information curated in metadata sources for events and event series?
- How well are acronyms used in citations of scientific events and event series?
- Acronym checker - does the Acronym fit the long version ...
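The acronym checker idea in the last item could be sketched as follows. The source does not say how the check should work, so this is only one hypothetical heuristic: the function name `acronym_fits` and the rule "the letters of the acronym must appear, in order, among the word initials of the long name" are assumptions made for illustration.

```python
import re


def acronym_fits(acronym: str, long_name: str) -> bool:
    """Hypothetical heuristic: the letters of the acronym must appear,
    in order, among the initials of the words of the long name."""
    # Ignore digits, blanks and punctuation such as the year in "ISWC 2012".
    letters = [c.lower() for c in acronym if c.isalpha()]
    initials = [w[0].lower() for w in re.findall(r"[A-Za-z]+", long_name)]
    pos = 0
    for letter in letters:
        try:
            # Search only to the right of the previously matched initial.
            pos = initials.index(letter, pos) + 1
        except ValueError:
            return False
    # An acronym without any letters never fits.
    return bool(letters)


print(acronym_fits("ISWC 2012", "International Semantic Web Conference"))
print(acronym_fits("ESWC", "International Semantic Web Conference"))
```

Short names that are words rather than initialisms (e.g. Semantics '23) would fail this check, so a real checker would need additional rules.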
Method
What do acronyms for scientific events and event series look like and how formally can they be described?
- Try regular expressions see Acronym_-_Regular_Expressions
- Check length histograms see https://github.com/WolfgangFahl/ConferenceCorpus/blob/main/tests/testAcronymCategory.py
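A length histogram of the kind mentioned above can be computed in a few lines. This is a minimal sketch and not the code of the linked test; the function name `length_histogram` and the three sample acronyms (taken from the examples on this page) are assumptions for illustration.

```python
from collections import Counter


def length_histogram(acronyms):
    """Count how many acronyms have each character length."""
    return Counter(len(a) for a in acronyms)


# Sample input only; the real analysis would use all extracted acronyms.
sample = ["ISWC 2012", "ESWC 2023", "Semantics '23"]
hist = length_histogram(sample)
for length in sorted(hist):
    # One '#' per acronym of that length gives a simple text histogram.
    print(f"{length:3d} {'#' * hist[length]}")
```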
Results
What do acronyms look like
Length distribution
WikiCFP
Standard case
About 60% of all acronyms extracted from WikiCFP match the regular expression
[A-Z]+\s*[12][0-9]{3}
e.g. ISWC 2012
43990/73731 (59.7%) matches for [A-Z]+\s*[12][0-9]{3}
654/43989 (1.5%) year different
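Counts like these could be produced along the following lines. This is a sketch under several assumptions: that the whole acronym has to match the expression (`fullmatch`), that "year different" means the year inside the acronym differs from the year recorded for the event, and that the input is a list of `(acronym, year)` pairs. The capture group around the year and the name `classify` are additions for illustration.

```python
import re

# Same expression as above, with a capture group added around the year.
PATTERN = re.compile(r"[A-Z]+\s*([12][0-9]{3})")


def classify(records):
    """records: iterable of (acronym, event_year) pairs.
    Returns (matches, year_different, total)."""
    total = matches = year_different = 0
    for acronym, event_year in records:
        total += 1
        m = PATTERN.fullmatch(acronym)
        if m:
            matches += 1
            # Assumed meaning of "year different": acronym year != event year.
            if event_year is not None and int(m.group(1)) != event_year:
                year_different += 1
    return matches, year_different, total


sample = [
    ("ISWC 2012", 2012),
    ("ISWC 2012", 2013),
    ("Knowledge Engineering Special Issue 2010", 2010),
]
matches, year_different, total = classify(sample)
print(f"{matches}/{total} matches, {year_different}/{matches} year different")
```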
Corner cases
Long acronyms tend to indicate that the extraction has not worked or that there is some other issue with the acronym, such as a joint/co-located event.
SELECT acronym
FROM "event_wikicfp"
where length(acronym)=40
The acronym entries with a length of 40 are mostly not acronyms:
...
Political Theology Agenda Symposium 2010
Knowledge Engineering Special Issue 2010
CFP MapReduce Special Issue of CCPE 2010
AOSD - Student Research Competition 2011
special session for Wireless VITAE 2011
Political Theology Agenda Symposium 2011
12th EANN / 7th AIAI Joint Congress 2011
...
Exotic cases / Outliers
There is only one entry in WikiCFP where the extracted acronym is longer than 50 characters.
SELECT acronym,url
FROM "event_wikicfp"
where length(acronym)>50
call for chapters - images of female aggression 2016 http://www.wikicfp.com/cfp/servlet/event.showcfp?eventid=52302
This is not a call for papers for a scientific event at all.
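The length queries above can be tried out on a small in-memory table. This sketch assumes the data is stored in SQLite and that the table has at least the columns `acronym` and `url`; the three sample rows are taken from this page, and the helper name `acronyms_longer_than` is hypothetical.

```python
import sqlite3

# In-memory stand-in for the real event_wikicfp table (assumed to be SQLite).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "event_wikicfp" (acronym TEXT, url TEXT)')
conn.executemany(
    'INSERT INTO "event_wikicfp" VALUES (?, ?)',
    [
        ("ISWC 2012", None),
        ("Political Theology Agenda Symposium 2010", None),
        (
            "call for chapters - images of female aggression 2016",
            "http://www.wikicfp.com/cfp/servlet/event.showcfp?eventid=52302",
        ),
    ],
)


def acronyms_longer_than(conn, min_length):
    """Return (acronym, url) rows whose acronym exceeds min_length characters."""
    rows = conn.execute(
        'SELECT acronym, url FROM "event_wikicfp" WHERE length(acronym) > ?',
        (min_length,),
    )
    return rows.fetchall()


outliers = acronyms_longer_than(conn, 50)
for acronym, url in outliers:
    print(acronym, url)
```

Passing the threshold as a query parameter makes it easy to repeat the check for other lengths.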