My Research and Publication Platforms

My Journal Platforms of Interest

Computational Social Networks

Focuses on the common principles, algorithms, and tools that govern network structures and topologies, network functionality, security and privacy, network behavior, information diffusion and influence, and social recommendation systems, and that apply to all types of social networks and social media. Topics include (but are not limited to) the following:

  • Social network design and architecture
  • Mathematical modeling and analysis
  • Real-world complex networks
  • Information retrieval in social contexts, political analysis
  • Network structure analysis
  • Network dynamics optimization
  • Complex network robustness and vulnerability
  • Information diffusion models and analysis
  • Security and privacy
  • Searching in complex networks
  • Efficient algorithms
  • Network behaviors
  • Trust and reputation
  • Social influence
  • Social recommendation
  • Social media analysis
  • Big data analysis on online social networks

Journal of Big Data

The journal examines the challenges facing big data today and going forward including, but not limited to: data capture and storage; search, sharing, and analytics; big data technologies; data visualization; architectures for massively parallel processing; data mining tools and techniques; machine learning algorithms for big data; cloud computing platforms; distributed file systems and databases; and scalable storage systems.


Embedded Scientific Computing

Data-Driven Business and Open Science

Using OpenCPU for integrating scientific computing into the next generation of systems and applications.

Methods for scientific computing are traditionally implemented in specialised software packages such as R or STATA. However, many users and organisations wish to integrate statistical computing into third-party software. So, rather than living in a specialised statistical environment, methods to analyse and visualise data get incorporated into pipelines, web applications and big data infrastructures.

OpenCPU is a software system for embedded statistical computation and reproducible research. The server exposes a web API that interfaces with R, LaTeX and Pandoc. This API is used, for example, to integrate statistical functionality into systems, to share and execute scripts or reports on centralized servers, and to build R-based “apps”.

An OpenCPU app is an R package that includes one or more web pages which call the R functions in the package through the OpenCPU API. This makes it a convenient way to develop, package and ship portable, standalone R web applications.
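As a rough illustration of what calling R through the web API can look like from third-party software, here is a minimal Python sketch. It assumes an OpenCPU server reachable at a base URL such as http://localhost:5656/ocpu (an assumption; use whatever address your server listens on), and the helper names ocpu_function_url and call_r_function are made up for this example. It posts JSON-encoded arguments to a function endpoint of the form /library/{package}/R/{function}/json and decodes the JSON reply.

```python
import json
import urllib.request


def ocpu_function_url(base_url, package, function, output="json"):
    """Build the endpoint for calling an R function and getting `output` back."""
    return f"{base_url.rstrip('/')}/library/{package}/R/{function}/{output}"


def call_r_function(base_url, package, function, **args):
    """POST the arguments as JSON and return the decoded JSON result.

    Requires a running OpenCPU server at `base_url` (assumed, not checked here).
    """
    request = urllib.request.Request(
        ocpu_function_url(base_url, package, function),
        data=json.dumps(args).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical local server; draws three random normals via stats::rnorm.
    print(call_r_function("http://localhost:5656/ocpu", "stats", "rnorm", n=3))
```

The URL builder is kept separate from the network call so that the same endpoint logic can be reused from a web page, a pipeline step, or a test without contacting a server.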

Research

  • focuses on domain-specific challenges related to integrating scientific computing into the next generation of systems and applications.

Automatic Recognition of Product Mentions in Text Corpora

Kaggle Competition

Identify product mentions within a largely user-generated web-based corpus and disambiguate the mentions against a large product catalog.

Challenge

  • to automatically identify all mentions of consumer products in a largely user-generated collection of web content, and to correctly identify the product(s) that each product mention refers to from a large catalog of products.
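To make the two parts of the task concrete (finding a mention, then linking it to catalog entries), here is a deliberately naive baseline sketch in Python. It is not the method used by any competitor; it does case-insensitive exact matching of catalog names, and the function name, the catalog layout (name mapped to a set of product IDs), and the sample data are all assumptions made for illustration.

```python
import re


def find_product_mentions(text, catalog):
    """Return sorted (start, end, product_ids) tuples for catalog names found in text.

    `catalog` maps a product name to the set of product IDs sharing that name,
    so an ambiguous mention is linked to every candidate product.
    """
    lowered = text.lower()
    mentions = []
    for name, product_ids in catalog.items():
        # Word boundaries avoid matching a name inside a longer token.
        pattern = r"\b" + re.escape(name.lower()) + r"\b"
        for match in re.finditer(pattern, lowered):
            mentions.append((match.start(), match.end(), sorted(product_ids)))
    return sorted(mentions)


# Toy catalog and text (invented for this example).
catalog = {"galaxy s4": {"p1", "p2"}, "iphone 5": {"p3"}}
text = "I swapped my iPhone 5 for a Galaxy S4."
mentions = find_product_mentions(text, catalog)
for start, end, product_ids in mentions:
    print(text[start:end], product_ids)
```

A loop over every catalog name would not scale to a catalog of fifteen million products, and exact matching misses misspellings and abbreviations, which is where the annotated mentions and data-driven approaches come in.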

Dataset

  • hundreds of thousands of text items, a product catalog with over fifteen million products, and hundreds of manually annotated product mentions supporting data-driven approaches.

Evaluation

 Winners:

1st: Zhanpeng Fang

2nd: Olexandr Topchylo

My Favourite Big Data and Machine Learning Startups

Neokami

AI platform for solving data security problems by leveraging next generation machine learning algorithms.

“Neokami’s CyberVault product enables companies to discover, secure and govern sensitive data in the cloud, on premise or across their physical assets.”

Enigma

Enigma bridges the gap between data and decisions. Enigma helps organizations and individuals fuse, organize, and explore data to make smarter decisions.
