FEIII Data Challenge: Financial Entity Identification and Integration Challenge

The Financial Entity Identification and Integration Challenge is a series of challenges to exploit the power of financial Big Data and computational methods over heterogeneous financial collections. A fundamental first step in benefiting from this wealth of financial Big Data is resolving mentions of, or references to, the same financial entity across multiple heterogeneous collections. While the Legal Entity Identifier (LEI) is a good first step toward a unique identifier (and associated data) for each financial entity, many additional challenges must be addressed to benefit fully from the adoption of the LEI. The FEIII Challenge broadly encompasses tasks that occur in semi-structured text in the financial domain. The primary task we will examine is record linkage: given a set of N reference identifier lists, align the identifiers across the lists that refer to the same financial entity. Further tasks will incorporate the resolution of mentions or references from financial documents. Financial entity identification and integration is a challenge for all financial institutions and for all software companies that provide IT support to these institutions. The FEIII Challenge is sponsored by the Office of Financial Research (OFR) and the National Institute of Standards and Technology (NIST).
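To make the record linkage task concrete, the following is a minimal sketch for the two-list case, not the challenge's reference method. It assumes each record is an (identifier, name) pair; the suffix list, the `normalize` and `link_records` helpers, and the 0.9 similarity threshold are illustrative assumptions, and string similarity from Python's standard `difflib` stands in for whatever matching technique a participant would actually use.

```python
import re
from difflib import SequenceMatcher

# Hypothetical legal-form tokens to ignore when comparing names (an assumption,
# not a list taken from the challenge).
LEGAL_SUFFIXES = {"inc", "corp", "corporation", "co", "company", "ltd", "llc"}


def normalize(name):
    """Lowercase, replace punctuation with spaces, and drop legal-form tokens."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def link_records(list_a, list_b, threshold=0.9):
    """Pair each record in list_a with its most similar record in list_b.

    Records are (identifier, name) tuples. A pair is kept only if its
    similarity score reaches the threshold. Returns (id_a, id_b, score) triples.
    """
    links = []
    for id_a, name_a in list_a:
        norm_a = normalize(name_a)
        best_id, best_score = None, 0.0
        for id_b, name_b in list_b:
            score = SequenceMatcher(None, norm_a, normalize(name_b)).ratio()
            if score > best_score:
                best_id, best_score = id_b, score
        if best_id is not None and best_score >= threshold:
            links.append((id_a, best_id, round(best_score, 3)))
    return links
```

This brute-force comparison is quadratic in the list sizes. Real identifier lists would usually need a blocking step (for example, comparing only records that share a state or ZIP code) and additional evidence such as addresses before a link is accepted.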

Understanding the Evolution Patterns of Ebola and Other Epidemics

Global epidemic propagation occurs at multiple (local and global) scales: individuals within a subpopulation may be infected through local contacts during a local outbreak, while travel between subpopulations can carry the infection to other regions. Disease spread simulations therefore require data and models, including social contact networks, local and global mobility patterns of individuals, transmission and recovery rates, and outbreak conditions. Effectively managing epidemic emergencies through real-time, continuous decision making requires computational models specifically tailored to the spatio-temporal dynamics of epidemics, together with data- and model-driven computer simulations of their spread. Epidemic simulations track hundreds of inter-dependent parameters, spanning multiple layers and geo-spatial frames, affected by complex dynamic processes operating at different resolutions. Moreover, given the unpredictability of the Ebola epidemic, decision makers need to generate ensembles of many thousands of simulations, each with different parameters corresponding to different, but plausible, scenarios. These simulations need to be continuously revised based on real-world data as the epidemic and intervention mechanisms evolve. Tools that help run and interpret epidemic simulation ensembles (aligned with real-world observations) to generate timely, actionable results are critically needed. The research will result in novel algorithms and tools (namely EpiDMS) specially tailored for officials to continuously assess the impacts of different intervention scenarios and revise estimates based on real-world data, at local and global scales, for the Ebola epidemic. The project results will also translate into predictions of the Ebola epidemic's characteristics, including its duration and overall size, and help the global efforts to prevent the disease from turning into a pandemic.

This NSF-funded project is a collaboration with the School of Public Health at Georgia State University.
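The idea of a simulation ensemble described above can be illustrated with a toy example. The sketch below is not EpiDMS and is not calibrated to Ebola: it assumes a single well-mixed population and a basic discrete-time SIR (susceptible, infected, recovered) model, and the function names, parameter ranges, and population size are all hypothetical choices made for illustration. Each run draws a different transmission rate and recovery rate, and the ensemble reports the two characteristics mentioned above, overall size and duration.

```python
import random
import statistics


def simulate_sir(beta, gamma, population=10_000, initial_infected=10, days=365):
    """Run a daily-step SIR model and return (final_size, duration_days).

    beta is the transmission rate and gamma the recovery rate, both per day.
    final_size counts everyone ever infected, including the initial cases.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    duration = days
    for day in range(1, days + 1):
        new_infections = min(s, beta * s * i / population)
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        if i < 1.0:  # treat fewer than one infected person as the end
            duration = day
            break
    return population - s, duration


def run_ensemble(n_runs, beta_range, gamma_range, seed=0):
    """Sample (beta, gamma) uniformly from the given ranges and simulate each."""
    rng = random.Random(seed)  # fixed seed so the ensemble is reproducible
    results = []
    for _ in range(n_runs):
        beta = rng.uniform(*beta_range)
        gamma = rng.uniform(*gamma_range)
        final_size, duration = simulate_sir(beta, gamma)
        results.append({"beta": beta, "gamma": gamma,
                        "final_size": final_size, "duration": duration})
    return results


def summarize(results):
    """Return the median final size and median duration across the ensemble."""
    return (statistics.median(r["final_size"] for r in results),
            statistics.median(r["duration"] for r in results))
```

A call such as `summarize(run_ensemble(1000, (0.2, 0.6), (0.1, 0.3)))` gives median outcomes over 1,000 illustrative scenarios. Revising the ensemble against real-world data would then amount to narrowing the sampled ranges, or reweighting runs, according to how well each run matches observed case counts.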

KarshaViz: Network and Visual Analytics

Research in finance and economics relies on econometric analysis of time-series datasets. While there is significant interest in exploring promising analytical and computational methodologies, including matrix- and tensor-based methods, ranking and clustering in graphs, and visual analytics, there are many limitations. The goal of the KarshaViz project is to create prototype datasets, analytical tools, and visualization portals, and to provide an open sandbox for multi-disciplinary research and exploration. KarshaViz is a collaboration among multiple academic and regulatory researchers.

MIMOSA: Knowledge Management for a Smart City

Community Maps 3.0 is a participatory GIS prototype platform that enables communities of interest, governments, and the general public to interact with multi-dimensional information spaces, to retrieve and upload geo-referenced information distributed across existing data sources, and to discuss the shared content. MIMOSA extends Community Maps 3.0 with advanced information search facilities that let people interact with the 3D Community Map and select the content to be visualized through a clear and dynamic user interface. We will design and develop a proof-of-concept semantic engine that exploits a multimodal domain ontology. The ontology will express the heterogeneous semantic content of the application domains, as well as summaries of the multimedia data and external sources of information, to improve the users’ experience in browsing and searching data. A key component of the semantic engine will be a personalization module, which will combine the available integrated open data, the domain ontology, and available private information characterizing users’ profiles and interaction histories in order to manage a personalized shared information space.