Laurent Romary

  1. Data formats for phonological corpora.

    Authors: Laurent Romary, Andreas Witt
    Subjects: Computation and Language (Computational Linguistics and Natural Language and Speech Processing)
    Abstract

    The goal of the present chapter is to explore the possibility of providing
    the research (but also the industrial) community that commonly uses spoken
    corpora with a stable portfolio of well-documented standardised formats that
    allow a high re-use rate of annotated spoken resources and, as a consequence,
    better interoperability across tools used to produce or exploit such resources.

  2. Stabilizing knowledge through standards - A perspective for the humanities.

    Authors: Laurent Romary
    Subjects: Computation and Language (Computational Linguistics and Natural Language and Speech Processing)
    Abstract

    It is usual to consider that standards generate mixed feelings among
    scientists. They are often seen as not really reflecting the state of the art
    in a given domain and as a hindrance to scientific creativity. Still, scientists
    should in theory be best placed to bring their expertise into
    standards development, being more neutral on issues that may typically be
    related to competing industrial interests.

  3. Comparing Repository Types - Challenges and barriers for subject-based repositories, research repositories, national repository systems and institutional repositories in serving scholarly communication.

    Authors: Laurent Romary, Chris Armbruster
    Subjects: Digital Libraries
    Abstract

    After two decades of repository development, some conclusions may be drawn as
    to which type of repository and what kind of service best supports digital
    scholarly communication, and thus the production of new knowledge. Four types
    of publication repository may be distinguished, namely the subject-based
    repository, research repository, national repository system and institutional
    repository. Two important shifts in the role of repositories may be noted. With
    regard to content, a well-defined and high quality corpus is essential.

  4. Comparing Repository Types - Challenges and barriers for subject-based repositories, research repositories, national repository systems and institutional repositories in serving scholarly communication.

    Authors: Laurent Romary, Chris Armbruster
    Subjects: Digital Libraries
    Abstract

    After two decades of repository development, some conclusions may be drawn as
    to which type of repository and what kind of service best supports digital
    scholarly communication, and thus the production of new knowledge. Four types
    of publication repository may be distinguished, namely the subject-based
    repository, research repository, national repository system and institutional
    repository. Two important shifts in the role of repositories may be noted. With
    regard to content, a well-defined and high quality corpus is essential.

  5. Representing human and machine dictionaries in Markup languages.

    Authors: Laurent Romary, Lothar Lemnitzer, Andreas Witt
    Subjects: Computation and Language (Computational Linguistics and Natural Language and Speech Processing)
    Abstract

    In this chapter we present the main issues in representing machine-readable
    dictionaries in XML, and in particular according to the Text Encoding
    Initiative (TEI) guidelines.

  6. Standardization of the formal representation of lexical information for NLP.

    Authors: Laurent Romary
    Subjects: Computation and Language (Computational Linguistics and Natural Language and Speech Processing)
    Abstract

    This paper surveys dictionary models and formats and presents the
    corresponding recent standardisation activities.

  7. Standards for Language Resources.

    Authors: Laurent Romary, Nancy Ide
    Subjects: Computation and Language (Computational Linguistics and Natural Language and Speech Processing)
    Abstract

    The goal of this paper is two-fold: to present an abstract data model for
    linguistic annotations and its implementation using XML, RDF and related
    standards; and to outline the work of a newly formed committee of the
    International Organization for Standardization (ISO), ISO/TC 37/SC 4 Language Resource
    Management, which will use this work as its starting point.

  8. Communication scientifique : Pour le meilleur et pour le PEER.

    Authors: Laurent Romary
    Subjects: Digital Libraries
    Abstract

    This paper provides an overview (in French) of the European PEER project,
    focusing on its origins, the actual objectives and the technical deployment.

  9. Towards Multimodal Content Representation.

    Authors: Laurent Romary, Harry Bunt
    Subjects: Computation and Language (Computational Linguistics and Natural Language and Speech Processing)
    Abstract

    Multimodal interfaces, combining the use of speech, graphics, gestures, and
    facial expressions in input and output, promise to provide new possibilities to
    deal with information in more effective and efficient ways, supporting for
    instance: - the understanding of possibly imprecise, partial or ambiguous
    multimodal input; - the generation of coordinated, cohesive, and coherent
    multimodal presentations; - the management of multimodal interaction (e.g.,
    task completion, adapting the interface, error prevention) by representing and
    exploiting models of the user, the domain, the task, the intera

  10. Standards for Language Resources.

    Authors: Laurent Romary, Nancy Ide
    Subjects: Computation and Language (Computational Linguistics and Natural Language and Speech Processing)
    Abstract

    This paper presents an abstract data model for linguistic annotations and its
    implementation using XML, RDF and related standards, and outlines the work of
    a newly formed committee of the International Organization for Standardization
    (ISO), ISO/TC 37/SC 4 Language Resource Management, which will use this work as
    its starting point. The primary motive for presenting the latter is to invite
    members of the research community to contribute to the work of the committee.

  11. A Common XML-based Framework for Syntactic Annotations.

    Authors: Laurent Romary, Nancy Ide, Tomaz Erjavec
    Subjects: Computation and Language (Computational Linguistics and Natural Language and Speech Processing)
    Abstract

    It is widely recognized that the proliferation of annotation schemes runs
    counter to the need to re-use language resources, and that standards for
    linguistic annotation are becoming increasingly mandatory. To answer this need,
    we have developed a framework comprised of an abstract model for a variety of
    different annotation types (e.g., morpho-syntactic tagging, syntactic
    annotation, co-reference annotation, etc.), which can be instantiated in
    different ways depending on the annotator's approach and goals.

  12. Marking-up multiple views of a Text: Discourse and Reference.

    Authors: Laurent Romary, Dan Cristea, Nancy Ide
    Subjects: Computation and Language (Computational Linguistics and Natural Language and Speech Processing)
    Abstract

    We describe an encoding scheme for discourse structure and reference, based
    on the TEI Guidelines and the recommendations of the Corpus Encoding
    Specification (CES). A central feature of the scheme is a CES-based data
    architecture enabling the encoding of and access to multiple views of a
    marked-up document. We describe a tool architecture that supports the encoding
    scheme, and then show how we have used the encoding scheme and the tools to
    perform a discourse analytic task in support of a model of global discourse
    cohesion called Veins Theory (Cristea & Ide, 1998).

  13. Dynamically Generated Interfaces in XML Based Architecture.

    Authors: Laurent Romary, Minit Gupta
    Subjects: Other
    Abstract

    Providing on-line services on the Internet will require the definition of
    flexible interfaces that are capable of adapting to the user's characteristics.
    This is all the more important in the context of medical applications like home
    monitoring, where no two patients have the same medical profile. Still, the
    problem is not limited to the capacity to define generic interfaces, as has
    been made possible by UIML, but extends to defining the underlying information
    structures from which these may be generated.

  14. Reference Resolution within the Framework of Cognitive Grammar.

    Authors: Laurent Romary, Susanne Salmon-Alt
    Subjects: Computation and Language (Computational Linguistics and Natural Language and Speech Processing)
    Abstract

    Following the principles of Cognitive Grammar, we concentrate on a model for
    reference resolution that attempts to overcome the difficulties of previous
    approaches, based on the fundamental assumption that all reference (independent
    of the type of the referring expression) is accomplished via access to and
    restructuring of domains of reference rather than by direct linkage to the
    entities themselves.

  15. A general XML-based distributed software architecture for accessing and sharing ressources.

    Authors: Laurent Romary, Samuel Cruz-Lara, Patrice Bonhomme, Christophe De Saint-Rat
    Subjects: Software Engineering
    Abstract

    This paper presents a general XML-based distributed software architecture with
    the aim of accessing and sharing resources in an open client/server
    environment. The paper is organized as follows. First, we introduce the idea
    of a "General Distributed Software Architecture". Second, we describe the
    general framework in which this architecture is used. Third, we describe the
    process of information exchange and we introduce some technical issues involved
    in the implementation of the proposed architecture.

  16. Multiple Retrieval Models and Regression Models for Prior Art Search.

    Authors: Patrice Lopez, Laurent Romary
    Subjects: Computation and Language (Computational Linguistics and Natural Language and Speech Processing)
    Abstract

    This paper presents the system called PATATRAS (PATent and Article Tracking,
    Retrieval and AnalysiS) realized for the IP track of CLEF 2009. Our approach
    presents three main characteristics: 1. The usage of multiple retrieval models
    (KL, Okapi) and term index definitions (lemma, phrase, concept) for the three
    languages considered in the present track (English, French, German) producing
    ten different sets of ranked results. 2. The merging of the different results
    based on multiple regression models using an additional validation set created
    from the patent collection. 3.
