Difference between revisions of "Acronym paper"
Revision as of 17:20, 1 March 2023
- For the definition of an acronym, see Acronym
Research questions
- What do acronyms for scientific events and event series look like and how formally can they be described?
- How well do acronyms disambiguate scientific events and event series?
- How well is the acronym information curated in metadata sources for events and event series?
- How well are acronyms used in citations of scientific events and event series?
- Acronym checker: does the acronym fit the long version ...
Method
What do acronyms for scientific events and event series look like and how formally can they be described?
- Try regular expressions, see Acronym_-_Regular_Expressions
- Check length histograms, see https://github.com/WolfgangFahl/ConferenceCorpus/blob/main/tests/testAcronymCategory.py
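The length-histogram step can be sketched in Python as below. This is a minimal illustration, not the code from the linked test file: the function name `length_histogram` and the sample list are hypothetical, and the histogram is printed as plain text rather than plotted.

```python
from collections import Counter


def length_histogram(acronyms):
    """Count how many acronyms have each character length."""
    return Counter(len(acronym) for acronym in acronyms)


# Hypothetical sample; real input would be the acronyms of one data source.
sample = [
    "ISWC 2012",
    "ESWC 2011",
    "AAAI 2020",
    "Political Theology Agenda Symposium 2010",
]

histogram = length_histogram(sample)
# One text bar per observed length, shortest first.
for length, count in sorted(histogram.items()):
    print(f"{length:3d} {'#' * count} ({count})")
```

Long tails in such a histogram point at entries that are probably titles rather than acronyms.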
Results
What do acronyms look like
Length distribution
WikiCFP
Standard case
About 60% of all acronyms extracted from WikiCFP match the regular expression [A-Z]+\s*[12][0-9]{3}, e.g. ISWC 2012
43990/73731 ( 59.7%) matches for [A-Z]+\s*[12][0-9]{3}
654/43989 ( 1.5%) year different
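How such a match rate could be computed is sketched below in Python. This is not the code that produced the figures above: the function name `match_rate` and the sample list are hypothetical, and the sketch assumes the whole acronym must match the pattern (`fullmatch`). The original count may have used a different matching mode.

```python
import re

# Pattern from the text: uppercase letters, optional whitespace, four-digit year.
STANDARD_CASE = re.compile(r"[A-Z]+\s*[12][0-9]{3}")


def match_rate(acronyms):
    """Return (matches, total, percent) for the standard-case pattern."""
    total = len(acronyms)
    # Assumption: the complete acronym has to match, not just a substring.
    matches = sum(1 for acronym in acronyms if STANDARD_CASE.fullmatch(acronym))
    percent = 100.0 * matches / total if total else 0.0
    return matches, total, percent


# Hypothetical sample; real input would be the extracted WikiCFP acronyms.
sample = [
    "ISWC 2012",
    "ESWC2011",
    "iswc 2012",
    "Political Theology Agenda Symposium 2010",
]

matches, total, percent = match_rate(sample)
print(f"{matches}/{total} ({percent:5.1f}%) matches for {STANDARD_CASE.pattern}")
```

With `fullmatch`, lowercase or multi-word entries are rejected. Switching to `search` would also count entries that merely contain an uppercase token followed by a year.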
- Corner cases:
SELECT acronym
FROM "event_wikicfp"
WHERE length(acronym) = 40
The acronym entries with a length of 40 are mostly not acronyms ...
...
Political Theology Agenda Symposium 2010
Knowledge Engineering Special Issue 2010
CFP MapReduce Special Issue of CCPE 2010
AOSD - Student Research Competition 2011
special session for Wireless VITAE 2011
Political Theology Agenda Symposium 2011
12th EANN / 7th AIAI Joint Congress 2011
...
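The corner-case query can be tried out with a small Python sketch. It assumes an SQLite database and uses an in-memory table filled with a few of the entries listed above. The table and column names are taken from the query; the helper `acronyms_of_length` and the sample rows are hypothetical.

```python
import sqlite3


def acronyms_of_length(conn, length):
    """Return all acronyms in event_wikicfp with exactly the given length."""
    cursor = conn.execute(
        'SELECT acronym FROM "event_wikicfp" WHERE length(acronym) = ?',
        (length,),
    )
    return [row[0] for row in cursor]


# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "event_wikicfp" (acronym TEXT)')
conn.executemany(
    'INSERT INTO "event_wikicfp" (acronym) VALUES (?)',
    [
        ("ISWC 2012",),
        ("Knowledge Engineering Special Issue 2010",),
        ("12th EANN / 7th AIAI Joint Congress 2011",),
    ],
)

for acronym in acronyms_of_length(conn, 40):
    print(acronym)
```

Passing the length as a parameter makes it easy to inspect other suspicious lengths in the same way.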
- Exotic case:
- Outliers: