-
A vector-space dynamic feature for phrase-based statistical machine translation
Abstract In this paper, we propose and evaluate a novel dynamic feature function for log-linear model combinations in phrase-based
statistical machine translation. The feature function is inspired by the well-known vector-space model, which is typically
used in information retrieval and text mining applications, and it aims at improving translation unit selection at decoding
time by incorporating context information from the source language. Significant improvements on an English-Spanish experimental
corpus are presented and discussed.
- Content Type Journal Article
- DOI 10.1007/s10844-010-0130-7
- Authors
- Marta R. Costa-jussà, Speech and Language Department, Barcelona Media Innovation Center, Av Diagonal 177, 9th floor, 08018 Barcelona, Spain
- Rafael E. Banchs, Human Language Technology Department, Institute for Infocomm Research, 1 Fusionopolis Way, 21-01, Connexis (South Tower), Singapore, 138632 Singapore
-
Optimizing queries to remote resources
Abstract One key property of the Semantic Web is its support for interoperability. Recent research in this area focuses on the integration
of multiple data sources to facilitate tasks such as ontology learning, user query expansion and context recognition. The
growing popularity of such mashups and the rising number of Web APIs supporting links between heterogeneous data providers
call for intelligent methods to spare remote resources and minimize delays imposed by queries to external data sources. This
paper suggests a cost and utility model for optimizing such queries by leveraging optimal stopping theory from business economics:
applications are modeled as decision makers that look for optimal answer sets. Queries to remote resources cause additional
cost but retrieve valuable information which improves the estimation of the answer set's utility. Optimal stopping optimizes
the trade-off between query cost and answer utility, yielding optimal query strategies for remote resources. These strategies
are compared to conventional approaches in an extensive evaluation based on real-world response times taken from seven popular
Web services.
- Content Type Journal Article
- DOI 10.1007/s10844-010-0129-0
- Authors
- Albert Weichselbraun, Vienna University of Economics and Business, Augasse 2-6, 1090 Vienna, Austria
-
Outlier detection by example
Abstract Outlier detection is a useful technique in such areas as fraud detection, financial analysis and health monitoring. Many recent
approaches detect outliers according to reasonable, pre-defined concepts of an outlier (e.g., distance-based, density-based,
etc.). However, the definition of an outlier differs between users or even datasets. This paper presents a solution to this
problem by including input from the users. Our OBE (Outlier By Example) system is the first that allows users to provide examples
of outliers in low-dimensional datasets. By incorporating a small number of such examples, OBE can successfully develop an
algorithm by which to identify further outliers based on their outlierness. Several algorithmic challenges and engineering
decisions must be addressed in building such a system. We describe the key design decisions and algorithms in this paper.
In order to interact with users having different degrees of domain knowledge, we develop two detection schemes: OBE-Fraction
and OBE-RF. Our experiments on both real and synthetic datasets demonstrate that OBE can discover values that a user would
consider outliers.
- Content Type Journal Article
- DOI 10.1007/s10844-010-0128-1
- Authors
- Cui Zhu, College of Computer Science, Beijing University of Technology, Beijing, 100124 People's Republic of China
- Hiroyuki Kitagawa, Graduate School of Systems and Information Engineering, Center for Computational Sciences, University of Tsukuba, Tsukuba, Ibaraki, 305-8577 Japan
- Spiros Papadimitriou, IBM T.J. Watson, Hawthorne, NY USA
- Christos Faloutsos, Carnegie Mellon University, Pittsburgh, PA USA
-
The ESTEEM platform: enabling P2P semantic collaboration through emerging collective knowledge
Abstract In this paper, we present Esteem (Emergent Semantics and cooperaTion in multi-knowledgE EnvironMents), a community-based P2P platform for supporting semantic
collaboration among a set of independent peers, without prior reciprocal knowledge or predefined relationships. The goal of
Esteem is to go beyond the existing state-of-the-art solutions for P2P knowledge sharing and to provide an integrated platform for
both data and service discovery. A distinguishing feature of Esteem is the use of semantic communities to explicitly give shape to the collective knowledge and expertise of peer groups with similar interests. Key techniques
of Esteem presented in the paper include shuffling-based communication, ontology and service matchmaking, context management, and quality-aware data integration. An application example of data and service discovery in the health-care domain is also presented, together with the results
of system and user evaluation.
- Content Type Journal Article
- DOI 10.1007/s10844-010-0125-4
- Authors
- Stefano Montanelli, Università degli Studi di Milano - DICo Via Comelico, 39 20135 Milano Italy
- Devis Bianchini, Università degli Studi di Brescia - DEA Via Branze, 38 25123 Brescia Italy
- Carola Aiello, Università di Roma "La Sapienza" - DIS Via Ariosto, 25 00185 Roma Italy
- Roberto Baldoni, Università di Roma "La Sapienza" - DIS Via Ariosto, 25 00185 Roma Italy
- Cristiana Bolchini, Politecnico di Milano - DEI Piazza Leonardo da Vinci, 32 20133 Milano Italy
- Silvia Bonomi, Università di Roma "La Sapienza" - DIS Via Ariosto, 25 00185 Roma Italy
- Silvana Castano, Università degli Studi di Milano - DICo Via Comelico, 39 20135 Milano Italy
- Tiziana Catarci, Università di Roma "La Sapienza" - DIS Via Ariosto, 25 00185 Roma Italy
- Valeria De Antonellis, Università degli Studi di Brescia - DEA Via Branze, 38 25123 Brescia Italy
- Alfio Ferrara, Università degli Studi di Milano - DICo Via Comelico, 39 20135 Milano Italy
- Michele Melchiori, Università degli Studi di Brescia - DEA Via Branze, 38 25123 Brescia Italy
- Elisa Quintarelli, Politecnico di Milano - DEI Piazza Leonardo da Vinci, 32 20133 Milano Italy
- Monica Scannapieco, Università di Roma "La Sapienza" - DIS Via Ariosto, 25 00185 Roma Italy
- Fabio A. Schreiber, Politecnico di Milano - DEI Piazza Leonardo da Vinci, 32 20133 Milano Italy
- Letizia Tanca, Politecnico di Milano - DEI Piazza Leonardo da Vinci, 32 20133 Milano Italy
-
Integrating web service and semantic dialogue model for user models interoperability on the web
Abstract Nowadays a great number of Web information systems build a model of the user and adapt their services according
to the needs and preferences maintained by the user model (UM). One of the most challenging issues in this scenario is
enabling different systems to cooperate in order to exchange the available information about a user. Our aim
is to create rich (and scalable) communication protocols and infrastructures to enable consumers and providers of UM data
to interact. Our solution for dealing with such an issue is to exploit Web standards for interoperability (i.e. Semantic Web
and Web Services) for implementing simple atomic communication, and a dialogue model for implementing enhanced communication
capabilities. In particular, two systems can start a semantics-enhanced Dialogue Game as a form of negotiation to clarify
the meaning of the requested concepts when a shared knowledge model does not exist, and to approximate the response when the
exact one is not available. We propose a distributed semantic conversation framework based on the Sesame semantic environment
for the exchange of user model knowledge on the Web. Systems have to expose their user model data as a Web Service, and to
exploit a public dialogue knowledge base to start the dialogue. The main advantage of the approach is to allow systems to
deal with difficult situations by starting an appropriate dialogue game instead of stopping the communication as in the traditional
"all-or-nothing" Web Service approach. On the basis of a preliminary evaluation, the approach has shown an improvement of
the adaptation results provided by the systems we tested.
- Content Type Journal Article
- DOI 10.1007/s10844-010-0126-3
- Authors
- Federica Cena, University of Turin Department of Computer Science Corso Svizzera 185 10149 Turin Italy