Workdocumentation 2023-12-22

From BITPlan cr Wiki

Event Signature Extraction task

Based on Event Signature definition in https://doi.org/10.5334/dsj-2022-011

Persistent Identification for Conferences
Data Science Journal
2022-04-05 | Journal article
DOI: 10.5334/dsj-2022-011
Contributors: Julian Franken; Aliaksandr Birukou; Kai Eckert; Wolfgang Fahl; Christian Hauschke; Christoph Lange

Prompt

I'd like to extract event signatures from a couple of hundred home page texts of scientific events. What would be a prompt for you to get the information in a structured format such as YAML or JSON? Here is the definition of the event signature:
Acronym: A short name for the conference often consisting of 3 to 8 upper case
letters trying to be unique but actually often being ambiguous. For instance, ISWC
may refer to the International Semantic Web Conference or to the International
Symposium on Wearable Computing.
Frequency: Annual, biennial, triennial – most events have an annual frequency, and
this is mostly not stated explicitly (not stated explicitly in this example).
Event reach: Target reach of the conference such as ‘International’, ‘European’, ‘East
Asian’ (International).
Event type: Such as Conference, Workshop, Symposium (Conference).
Year: A two or four-digit reference to the year in which the event took place – not to
be confused with the year of publication of the proceedings, which might be different
(2019).
Ordinal: Often used to enumerate the conference series instances (18th).
Date: Start date and end date or date range of the conference (October 26–30).
Location: Description of the location of the conference often consisting of country,
region, and city – sometimes with details about the exact venue. (Auckland, New
Zealand).
Title: The title often contains scope, type, and subject of the conference (International
Semantic Web Conference).
Subject: Description of what the conference is about often prefixed with ‘on’
(Semantic Web).
Delimiters: A variety of syntactic delimiters such as blanks, commas, colons, brackets
are used depending on the citation style

and an example of a homepage text:
QUATIC'1994
    1st International Conference on the Quality of Information and Communications Technology
    Hotel Penta
    27 May, 1994
    Lisbon, Portugal
    Toggle navigation
    PROCEEDINGS |
    DBLP |
    GOOGLE CITATIONS |
    QUATIC CONFERENCE SERIES
    QUATIC'1994
    Committees
    Program
    Conference Venue
    Sponsors
    QUATIC Conference Series

Answer

To extract event signatures from home page texts of scientific events in a structured format like YAML or JSON, you need to prompt the AI to identify and structure specific pieces of information based on the definition of the event signature you provided. The prompt should guide the AI to look for acronyms, frequencies, event reach, types, years, ordinals, dates, locations, titles, and subjects in the text. Here's how you could formulate such a prompt:

I have texts from home pages of scientific events. I need to extract structured information about each event in a clear, concise format. Here are the details I need for each event signature:

- Acronym: The short name of the conference, often in uppercase.
- Frequency: How often the event occurs, like annual or biennial.
- Event reach: The geographical or demographic reach of the event, like International or European.
- Event type: The format of the event, such as Conference, Workshop, or Symposium.
- Year: The year in which the event takes place.
- Ordinal: The instance number of the event, like 18th or 1st.
- Date: The start and end date or date range of the event.
- Location: The country, region, and city of the event, and sometimes specific venue details.
- Title: The full title of the event, often indicating the scope and subject.
- Subject: The main topic or focus of the event.

Given a homepage text of an event, please identify and extract this information in a structured format (either YAML or JSON). Here's an example of a homepage text:

"QUATIC'1994
1st International Conference on the Quality of Information and Communications Technology
Hotel Penta
27 May, 1994
Lisbon, Portugal
Toggle navigation
PROCEEDINGS |
DBLP |
GOOGLE CITATIONS |
QUATIC CONFERENCE SERIES
QUATIC'1994
Committees
Program
Conference Venue
Sponsors
QUATIC Conference Series"

Based on this text, please extract the event signature information and present it in a structured format.
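
To make the target of such a prompt concrete, the following sketch fills in the signature for the QUATIC'1994 text by hand and serializes it as JSON. The `EventSignature` class and its field names are assumptions made for illustration, and the values were read off the homepage text manually; they are not model output.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class EventSignature:
    """Hypothetical container for the signature elements listed in the prompt."""
    acronym: Optional[str] = None
    frequency: Optional[str] = None
    event_reach: Optional[str] = None
    event_type: Optional[str] = None
    year: Optional[int] = None
    ordinal: Optional[str] = None
    date: Optional[str] = None
    location: Optional[str] = None
    title: Optional[str] = None
    subject: Optional[str] = None

    def to_json(self) -> str:
        # Drop elements that the homepage text does not state.
        found = {k: v for k, v in asdict(self).items() if v is not None}
        return json.dumps(found, indent=2, ensure_ascii=False)


# Values read manually from the QUATIC'1994 homepage text above.
quatic = EventSignature(
    acronym="QUATIC",
    event_reach="International",
    event_type="Conference",
    year=1994,
    ordinal="1st",
    date="1994-05-27",
    location="Hotel Penta, Lisbon, Portugal",
    title="International Conference on the Quality of Information and Communications Technology",
    subject="Quality of Information and Communications Technology",
)
print(quatic.to_json())
```

The frequency is not stated on the page, so it stays `None` and is left out of the JSON, which mirrors the "not stated explicitly" case in the signature definition.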

Example Signature

Based on https://ceur-ws.org/Vol-2531/ pointing to https://ase.in.tum.de/seuh20/

Prompt

I'd like to know which of the following event signature elements:
- Acronym: The short name of the conference, often in uppercase.
- Frequency: How often the event occurs, like annual or biennial.
- Event reach: The geographical or demographic reach of the event, like International or European.
- Event type: The format of the event, such as Conference, Workshop, or Symposium.
- Year: The year in which the event takes place.
- Ordinal: The instance number of the event, like 18th or 1st.
- Date: The start and end date or date range of the event.
- Location: The country, region, and city of the event, and sometimes specific venue details.
- Title: The full title of the event, often indicating the scope and subject.
- Subject: The main topic or focus of the event.

are available in the following homepage text:
  #seuh20
    Software Engineering im Unterricht der Hochschulen
    Skip to content
    Start
    Programm
    Keynote
    Registrierung
    Call for Paper
    Organisation
    Mailingliste
    SE 20 Tagung
    SEUH Geschichte
    Tagungsband
    Software Engineering im Unterricht der Hochschulen
    Die SEUH ist seit vielen Jahren das Forum im deutschsprachigen Raum, auf dem Lehrende aus Universitäten, Hochschulen für angewandte Wissenschaften sowie dualen Hochschulen ihre Erfolge, Misserfolge und Erfahrungen in der Software Engineering Ausbildung vorstellen, diskutieren und gemeinsam die Qualität der Lehre verbessern. Der Workshop bietet viel Raum für Diskussionen, sowohl während der Sitzungen als auch in den Pausen. Kurze Vorträge mit anschließenden Diskussionsphasen fördern den Austausch der Teilnehmer. Dies haben die Teilnehmer in der Vergangenheit ausgiebig in lebhaften und konstruktiven Gesprächen genutzt. Viele Lehrende haben von der SEUH entscheidende Impulse für ihre Arbeit erhalten.
    Der Workshop findet zusammen mit der Tagung Software Engineering (SE) der Gesellschaft für Informatik (GI) am 26. und 27. Februar 2020 in Innsbruck, Österreich statt. Tagungsort ist das Hotel Grauer Bär in der Innsbrucker Innenstadt.
    Wir suchen Originalbeiträge, die sich konzeptionell oder empirisch mit der Ausbildung im Software Engineering befassen. Im Fokus stehen neue Erkenntnisse und Erfahrungen in der Vermittlung von Entwicklungsmethoden. Werkzeuge zur Verbesserung der Lehre sind ebenso wollkommen, genauso wie neue Ideen zur Lösung wohlbekannter Probleme in der Ausbildung.
    Weitere Details sind im Call for Paper zu finden. Einreichungsfrist ist der 04.11.2019 (siehe auch wichtige Termine rechts).
    Kontakt
    Stephan Krusche, Technische Universität München, krusche@in.tum.de
    Stefan Wagner, Universität Stuttgart, stefan.wagner@iste.uni-stuttgart.de
    Wichtige Termine
    Einreichungsfrist
    04.11.2019 (verlängert)
    Benachrichtigung
    08.12.2019
    Finale Version
    08.01.2020
    SEUH Workshop
    26.-27.02.2020
    Einreichung
    EasyChair
    Follow @seuh20
    Tweets by seuh20
    #seuh20
    Proudly powered by WordPress.


Answer

Based on the provided homepage text, here's the information available for the event signature elements you mentioned:

    Acronym: "SEUH" (found in the text as "#seuh20").
    Frequency: The frequency of the event is not explicitly mentioned in the text.
    Event reach: The geographical or demographic reach of the event is not explicitly mentioned in the text.
    Event type: The event type is mentioned as a "Workshop" and "Tagung" (Conference) in the text.
    Year: The year of the event is mentioned as "26.-27.02.2020" (February 26-27, 2020).
    Ordinal: The instance number of the event is not explicitly mentioned in the text.
    Date: The start and end date of the event are mentioned as "26.-27.02.2020" (February 26-27, 2020).
    Location: The location of the event is mentioned as "Innsbruck, Österreich" (Innsbruck, Austria) at the "Hotel Grauer Bär in der Innsbrucker Innenstadt."
    Title: The full title of the event is "Software Engineering im Unterricht der Hochschulen."
    Subject: The main topic or focus of the event is software engineering education and the exchange of experiences among educators in the German-speaking region.

Please note that some elements like "Frequency," "Event reach," and "Ordinal" are not explicitly mentioned in the provided text, so they may not be applicable to this specific event.

YAML Format I

Prompt

provide the event signature elements:
- Acronym: The short name of the conference, often in uppercase.
- Frequency: How often the event occurs, like annual or biennial.
- Event reach: The geographical or demographic reach of the event, like International or European.
- Event type: The format of the event, such as Conference, Workshop, or Symposium.
- Year: The year in which the event takes place.
- Ordinal: The instance number of the event, like 18th or 1st.
- Date: The start and end date or date range of the event.
- Location: The country, region, and city of the event, and sometimes specific venue details.
- Title: The full title of the event, often indicating the scope and subject.
- Subject: The main topic or focus of the event.

in YAML Format.
use lowercase/underscore for the element names and leave out elements that are not found. Use ISO date format for dates. Give the location as country/region and city using ISO codes
by extracting from the following homepage text:

Answer

Here are the event signature elements extracted from the provided homepage text in YAML format:

event:
  acronym: "SEUH"
  event_type: "Workshop"
  year: "2020-02-26"
  date: "2020-02-26/2020-02-27"
  location:
    country: "AT"  # Austria
    region: "TY"   # Tyrol
    city: "Innsbruck"
  title: "Software Engineering im Unterricht der Hochschulen"
  subject: "Software Engineering education and exchange of experiences"

Please note that some elements like "Frequency," "Event reach," and "Ordinal" are not explicitly mentioned in the provided text, so they are left out of the YAML representation. The location is represented using ISO codes for country and region.

YAML Format II

Prompt

provide the event signature elements:
- Acronym: The short name of the conference, often in uppercase.
- Frequency: How often the event occurs, like annual or biennial.
- Event reach: The geographical or demographic reach of the event, like International or European.
- Event type: The format of the event, such as Conference, Workshop, or Symposium.
- Year: The year in which the event takes place.
- Ordinal: The instance number of the event, like 18th or 1st.
- Date: The start and end date or date range of the event.
- Location: The country, region, and city of the event, and sometimes specific venue details.
- Title: The full title of the event, often indicating the scope and subject.
- Subject: The main topic or focus of the event.

in YAML Format.
use lowercase/underscore for the element names and leave out elements that are not found. Use ISO date format for dates. Use start_date and end_date as field names. Give the year as a 4 digit integer. Give the location as country/region and city using ISO codes. Answer with the raw YAML only with no further comments outside the YAML. If you must comment use a comments field.
Extract as instructed from the following homepage text:
    AVICH2022@avi22
    Home
    CFP
    Submission
    Program
    Organization
    Contact
    Welcome to the web site of AVICH 2022, the second Workshop on Advanced Visual Interfaces and Interactions in Cultural Heritage, which will be held
    in conjunction with AVI 2022
    AVICH 2022: Workshop on Advanced Visual Interfaces and Interactions in Cultural Heritage
    Welcome to the web site of AVICH 2022, the second Workshop on Advanced Visual Interfaces and Interactions in Cultural Heritage , which will be held
    in conjunction with AVI 2022 the International Conference on Advanced Visual Interfaces, which will take place at Villa Tuscolana Park Hotel, Frascati, Rome, Italy, June 6th-10th 2022
    Copyright 2012 Andia - All rights reserved. Template by Azmind.

Answer

event:
  acronym: "AVICH 2022"
  event_type: "Workshop"
  year: 2022
  start_date: "2022-06-06"
  end_date: "2022-06-10"
  location:
    country: "IT"  # Italy
    region: "62"   # Lazio
    city: "Frascati"
  title: "Workshop on Advanced Visual Interfaces and Interactions in Cultural Heritage"
  subject: "Advanced Visual Interfaces and Interactions in Cultural Heritage"
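
Whether an answer actually follows these format instructions can be checked mechanically once the YAML has been parsed. The sketch below assumes the answer is already available as a Python dict (the mapping under `event:`); `check_signature` and `ALLOWED_FIELDS` are hypothetical names, and the allowed field set is one reading of the prompt, not a fixed schema.

```python
import re
from datetime import date

# Field names permitted by the prompt, in lowercase/underscore form (an assumption).
ALLOWED_FIELDS = {
    "acronym", "frequency", "event_reach", "event_type", "year", "ordinal",
    "start_date", "end_date", "location", "title", "subject", "comments",
}


def check_signature(record: dict) -> list:
    """Return a list of problems found in one parsed signature record."""
    problems = []
    for key in record:
        if not re.fullmatch(r"[a-z]+(_[a-z]+)*", key):
            problems.append(f"field name not lowercase/underscore: {key}")
        elif key not in ALLOWED_FIELDS:
            problems.append(f"unexpected field: {key}")
    year = record.get("year")
    if year is not None and not (isinstance(year, int) and 1000 <= year <= 9999):
        problems.append(f"year is not a 4 digit integer: {year!r}")
    for key in ("start_date", "end_date"):
        value = record.get(key)
        if value is None:
            continue
        try:
            date.fromisoformat(value)
        except (TypeError, ValueError):
            problems.append(f"{key} is not an ISO date: {value!r}")
    return problems


# The AVICH 2022 answer above, as it would look after YAML parsing.
avich = {
    "acronym": "AVICH 2022",
    "event_type": "Workshop",
    "year": 2022,
    "start_date": "2022-06-06",
    "end_date": "2022-06-10",
    "location": {"country": "IT", "region": "62", "city": "Frascati"},
    "title": "Workshop on Advanced Visual Interfaces and Interactions in Cultural Heritage",
    "subject": "Advanced Visual Interfaces and Interactions in Cultural Heritage",
}
# Part of the earlier SEUH answer, which put a full date into the year field.
seuh = {"acronym": "SEUH", "year": "2020-02-26", "date": "2020-02-26/2020-02-27"}

print(check_signature(avich))
print(check_signature(seuh))
```

The AVICH record passes, while the SEUH record from the first YAML attempt is flagged for its `year` value and for the `date` field, which are the two points this prompt tightened.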

Via API

Prompt prefix

provide the event signature elements:
- Acronym: The short name of the conference, often in uppercase.
- Frequency: How often the event occurs, like annual or biennial.
- Event reach: The geographical or demographic reach of the event, like International or European.
- Event type: The format of the event, such as Conference, Workshop, or Symposium.
- Year: The year in which the event takes place.
- Ordinal: The instance number of the event, like 18th or 1st.
- Date: The start and end date or date range of the event.
- Location: The country, region, and city of the event, and sometimes specific venue details.
- Title: The full title of the event, often indicating the scope and subject.
- Subject: The main topic or focus of the event.

in YAML Format.
use lowercase/underscore for the element names and leave out elements that are not found.
Use ISO date format for dates.
Use start_date and end_date as field names.
Give the year as a 4 digit integer.
Give the location as country/region and city.
Give the country using a 2-letter ISO 3166-1 alpha-2 code.
Give the region using its ISO 3166-2 code.
Give the city with its English label.
Answer with the raw yaml only with no further comments outside the yaml. If you must comment use a comments field.
Do not add any fields beyond the given list above.
Stick to the requested fields only, and never ever add any extra information.

Valid answers would look like, for example:
# AVICH 2022
acronym: "AVICH 2022"
event_type: "Workshop"
year: 2022
start_date: "2022-06-06"
end_date: "2022-06-10"
country: "IT"  # Italy
region: "IT-62"   # Lazio
city: "Frascati"
title: "Workshop on Advanced Visual Interfaces and Interactions in Cultural Heritage"
subject: "Advanced Visual Interfaces and Interactions in Cultural Heritage"

# BMAW 2014
acronym: "BMAW 2014"
frequency: "Annual"
event_reach: "International"
event_type: "Workshop"
year: 2014
country: "NL"  # Netherlands
region: "NL-NH" # Noord-Holland
city: "Amsterdam"
title: "Bayesian Modeling Applications Workshop"
subject: "Bayesian Modeling Applications"
  
Extract as instructed from the following homepage text:
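
For a batch run, the prefix above stays fixed and only the homepage text changes per request. A minimal sketch of that assembly step follows. The function names, the whitespace cleanup, and the keying by volume number are assumptions; the actual API call is left out because this page does not document it.

```python
from typing import Dict

# Stand-in for the full prompt prefix documented above; only its last line is repeated here.
PROMPT_PREFIX = "Extract as instructed from the following homepage text:"


def build_prompt(homepage_text: str, prefix: str = PROMPT_PREFIX) -> str:
    """Append one cleaned homepage text to the fixed prompt prefix."""
    # Strip per-line indentation and drop empty lines left over from HTML-to-text conversion.
    lines = [line.strip() for line in homepage_text.splitlines()]
    body = "\n".join(line for line in lines if line)
    return f"{prefix}\n{body}\n"


def build_prompts(pages: Dict[int, str]) -> Dict[int, str]:
    """Map each volume number to the prompt for its homepage text."""
    return {volume: build_prompt(text) for volume, text in pages.items()}


# Hypothetical input: volume number -> homepage text (shortened for illustration).
pages = {1565: "  BMAW 2015\n\n    Bayesian Modeling Applications Workshop\n"}
prompts = build_prompts(pages)
print(prompts[1565])
```

Keeping the prefix as one constant means every homepage text in a run is extracted under the same instructions, so differences between answers come from the pages and not from the prompt.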

Sample answer with volume added by test case

- volume: 1565
  acronym: BMAW 2015
  frequency: Annual
  event_type: Workshop
  year: 2015
  start_date: '2015-07-16'
  end_date: '2015-07-16'
  country: NL
  region: NL-NH
  city: Amsterdam
  title: Bayesian Modeling Applications Workshop
  subject: Bayesian Modeling Applications
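
The volume number is not part of the model's answer; it is attached afterwards. How the test case does this is not shown here, so the following is only a sketch under that assumption, with hypothetical helper names. It also adds a small consistency check that the ISO 3166-2 region code carries the country code as its prefix, as in `NL-NH`.

```python
def with_volume(record: dict, volume: int) -> dict:
    """Return a copy of the record with the volume number as its first key."""
    return {"volume": volume, **record}


def region_matches_country(record: dict) -> bool:
    """Check that an ISO 3166-2 region code starts with the record's country code."""
    country, region = record.get("country"), record.get("region")
    if country is None or region is None:
        return True  # nothing to compare
    return region.startswith(f"{country}-")


# The parsed model answer for BMAW 2015, without the volume.
bmaw = {
    "acronym": "BMAW 2015",
    "frequency": "Annual",
    "event_type": "Workshop",
    "year": 2015,
    "start_date": "2015-07-16",
    "end_date": "2015-07-16",
    "country": "NL",
    "region": "NL-NH",
    "city": "Amsterdam",
    "title": "Bayesian Modeling Applications Workshop",
    "subject": "Bayesian Modeling Applications",
}

entry = with_volume(bmaw, 1565)
print(list(entry)[:2])
print(region_matches_country(entry))
```

A bare region value such as `62` without the `IT-` prefix would fail this check, which is the difference between the YAML Format II answer and the examples given in the prompt prefix.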