My Research and Publication Platforms

Posted April 20, 2016; updated April 30, 2016, by cltmadmin

My Journal Platforms of Interest

Computational Social Networks

The journal focuses on the common principles, algorithms and tools that govern network structures and topologies, network functionalities, security and privacy, network behaviors, information diffusion and influence, and social recommendation systems, all of which apply to every type of social network and social media. Topics include (but are not limited to) the following:

Social network design and architecture
Mathematical modeling and analysis
Real-world complex networks
Information retrieval in social contexts, political analysts
Network structure analysis
Network dynamics optimization
Complex network robustness and vulnerability
Information diffusion models and analysis
Security and privacy
Searching in complex networks
Efficient algorithms
Network behaviors
Trust and reputation
Social influence
Social recommendation
Social media analysis
Big data analysis on online social networks

Journal of Big Data

The journal examines the challenges facing big data today and going forward including, but not limited to: data capture and storage; search, sharing, and analytics; big data technologies; data visualization; architectures for massively parallel processing; data mining tools and techniques; machine learning algorithms for big data; cloud computing platforms; distributed file systems and databases; and scalable storage systems.


Embedded Scientific Computing

Posted April 7, 2016; updated April 8, 2016, by cltmadmin

Data-Driven Business and Open Science

Using OpenCPU for integrating scientific computing into the next generation of systems and applications.

Methods for scientific computing are traditionally implemented in specialised software packages such as R or STATA. However, many users and organisations wish to integrate statistical computing into third-party software. And so, rather than living in a specialised statistical environment, methods to analyse and visualise data get incorporated into pipelines, web applications and big data infrastructures.

OpenCPU is a software system for embedded statistical computation and reproducible research. The server exposes a web API that interfaces with R, LaTeX and Pandoc. This API is used, for example, to integrate statistical functionality into other systems, to share and execute scripts or reports on centralized servers, and to build R-based “apps”.

An OpenCPU app is an R package that includes one or more web pages which call the R functions in the package through the OpenCPU API. This provides a convenient way to develop, package and ship portable, standalone R web applications.
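As a rough illustration of how a third-party program might call the web API described above, here is a minimal Python sketch. It assumes an OpenCPU server is running at `http://localhost:5656/ocpu` (a placeholder for whatever server you use) and that arguments are posted as a JSON object. The helper names `function_url` and `call_r` are made up for this example and are not part of OpenCPU.

```python
import json
import urllib.request

# Assumption: a local OpenCPU server; replace with the address of your own server.
BASE_URL = "http://localhost:5656/ocpu"


def function_url(package, function, base_url=BASE_URL, fmt="json"):
    """Build the endpoint for calling an R function from an installed package."""
    return f"{base_url}/library/{package}/R/{function}/{fmt}"


def call_r(package, function, args, base_url=BASE_URL, timeout=10):
    """POST the arguments as a JSON object and return the decoded JSON result."""
    request = urllib.request.Request(
        function_url(package, function, base_url),
        data=json.dumps(args).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With a server running, `call_r("stats", "rnorm", {"n": 3})` would ask R for three random normal draws. Without a server the call simply fails with a connection error.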

Research

Focuses on domain-specific challenges related to integrating scientific computing into the next generation of systems and applications.

Contact

Jeroen Ooms: postdoctoral researcher

Automatic Recognition of Product Mentions in Text Corpora

Posted April 7, 2016; updated April 7, 2016, by cltmadmin

Kaggle Competition

Identify product mentions within a largely user-generated web-based corpus and disambiguate the mentions against a large product catalog.

Challenge

Automatically identify all mentions of consumer products in a largely user-generated collection of web content, and correctly identify the product(s) in a large product catalog that each mention refers to.
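To make the two steps concrete (find a mention, then link it to catalog entries), here is a deliberately naive dictionary-matching baseline in Python. It is only a sketch and not the method used by any competitor. The toy `CATALOG` and the helper names are invented for illustration, and exact name matching would miss the misspellings and abbreviations that are common in user-generated text.

```python
import re

# Hypothetical toy catalog: product id -> product name.
CATALOG = {
    "p1": "Acme Turbo Blender",
    "p2": "Acme Turbo Blender Pro",
    "p3": "Zeta Phone 5",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(catalog):
    """Map each tokenized product name to the set of ids carrying that name."""
    index = {}
    for product_id, name in catalog.items():
        index.setdefault(tuple(tokenize(name)), set()).add(product_id)
    return index


def find_mentions(text, index):
    """Greedy longest-match scan; returns (start_token, end_token, product_ids)."""
    tokens = tokenize(text)
    max_len = max((len(key) for key in index), default=0)
    mentions = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first so "... Blender Pro" beats "... Blender".
        for length in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + length])
            if key in index:
                mentions.append((i, i + length, sorted(index[key])))
                i += length
                break
        else:
            i += 1
    return mentions
```

For example, `find_mentions("I love my Acme Turbo Blender Pro", build_index(CATALOG))` links the token span 3 to 7 to product `p2` rather than to the shorter name of `p1`.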

Dataset

Hundreds of thousands of text items, a product catalog with over fifteen million products, and hundreds of manually annotated product mentions to support data-driven approaches.

Evaluation

Submissions of disambiguated product mentions are scored on the mean F1 metric.
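A mean F1 score of the kind mentioned above can be sketched in Python as follows. This is an illustration under stated assumptions, not the official scoring script: it assumes each mention has a set of predicted product ids and a set of true product ids, that F1 is computed per mention and then averaged, and that two empty sets count as a perfect match. The competition's exact definition may differ.

```python
def f1_score(predicted, actual):
    """F1 between a predicted and a true set of product ids for one mention."""
    predicted, actual = set(predicted), set(actual)
    if not predicted and not actual:
        return 1.0  # assumption: nothing to find and nothing predicted is a match
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


def mean_f1(predictions, truths):
    """Average the per-mention F1 over every mention listed in `truths`."""
    if not truths:
        raise ValueError("truths must contain at least one mention")
    scores = [f1_score(predictions.get(key, ()), ids) for key, ids in truths.items()]
    return sum(scores) / len(scores)
```

Mentions that appear in `truths` but are missing from `predictions` score zero, so skipping a hard mention is never better than guessing correctly.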

Winners:

1st: Zhanpeng Fang

Solution documentation and interview: C, Python and Perl
Paper: Accurate Product Name Recognition from User Generated Content
Slide

2nd: Olexandr Topchylo

Documentation: C++

My Favourite Big Data and Machine Learning Startups

Posted April 1, 2016; updated May 2, 2016, by cltmadmin

Neokami

AI platform for solving data security problems by leveraging next generation machine learning algorithms.

“Neokami’s CyberVault product enables companies to discover, secure and govern sensitive data in the cloud, on premise or across their physical assets.”

Use cases: social media monitoring, image analysis, Customer Relationship Management (CRM).
- real-time emotional heat map.
Neokami Inc. recently participated in the Kaggle competition “Forecast sales using store, promotion, and competitor data” and ranked 3rd.
Careers at Neokami

Enigma

Enigma bridges the gap between data and decisions. Enigma helps organizations and individuals fuse, organize, and explore data to make smarter decisions.

Big Data Analytics Hub

Big data: research and practice

Category: Big Data