February 6, 2018

The Best of AI: New Articles Published This Month (January 2018)

10 data articles handpicked by the Sicara team, just for you

Welcome to the first edition of the year of our favorite AI articles published this month. We are a Paris-based company that does Agile data development. This month, we spotted articles about natural language processing, MNIST, blockchain, ensemble methods and more. We advise you to have a Python environment ready if you want to follow some of the tutorials :). Let’s kick off with the comic of the month:

machine saying nice
You could make a pretty good version of this by just having the machine say ‘NICE’ every time you make any kind of purchase.

1 — What’s the difference between data science, machine learning, and artificial intelligence?

machine learning system comic

To start, an article on the differences between AI, data science and machine learning. The main idea I take away from it is that “artificial intelligence produces actions”. This reminded me of a TED talk I watched a few years back, in which a scientist explains how a certain sea squirt uses its nervous system to move around at the beginning of its life. Once it has settled on a rock, it doesn’t need its brain anymore, so it just eats it… Food for thought!

Read What’s the difference between data science, machine learning, and artificial intelligence? — from David Robinson

2 — A Tour of The Top 10 Algorithms for Machine Learning Newbies

algorithm

This article goes over ten common machine learning algorithms. The author doesn’t go into detail, and I was already familiar with most of them, but it introduced me to the learning vector quantization algorithm, so maybe you’ll learn something new too.

Read A Tour of The Top 10 Algorithms for Machine Learning Newbies — from James Le

3 — Exploring handwritten digit classification: a tidy analysis of the MNIST dataset

mnist dataset

The next one really spoke to me: it explores the MNIST dataset of handwritten digits. I have had the opportunity to benchmark some models on this dataset, and this post really makes the data talk. I enjoyed the simple steps the author takes to detect where an ML algorithm would run into difficulties, even before applying any model.

Read Exploring handwritten digit classification: a tidy analysis of the MNIST dataset — from David Robinson

4 — Do algorithms reveal sexual orientation or just expose our stereotypes?

composite heterosexual and gay faces

Do you remember the paper that claimed to use deep learning to tell whether you were gay or straight from your facial features? This article, written by research scientists from Google AI Group, analyses the dataset used in that study, and the conclusions are extremely interesting! It really made me think about the way I interpret my ML results, and how easy it is to end up with a biased dataset.

Read Do algorithms reveal sexual orientation or just expose our stereotypes? — from Blaise Aguera y Arcas

5 — How to solve 90% of NLP problems: a step-by-step guide

solve problem

This one may be my favorite this month, although I may be biased because it’s written by my engineering school flatmate :). It walks you progressively through the major NLP algorithms and explains how to interpret your models’ results.

By the way, we’re preparing a blog article on training Named Entity Recognition on foreign languages. Don’t forget to follow us!

Read How to solve 90% of NLP problems: a step-by-step guide — from Emmanuel Ameisen

6 — The Convergence of AI and Blockchain: What’s the deal?

brain

The last couple of months have been tumultuous in the cryptocurrency world, and a lot of questions have been raised about the blockchain. The author shares his views on the possible intersections of AI and blockchain, and he raises some intriguing points about data marketization and decentralized intelligence. What if all the energy used by miners went into training ML algorithms instead of “dumbly” computing hashes? Makes you dream, doesn’t it?

By the way, if the blockchain concept is still a little obscure to you, check out this really cool article where the author codes a blockchain in Python from scratch.
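If you are wondering what "computing hashes" looks like in practice, here is a minimal sketch of a toy proof-of-work loop. It is not taken from either article and is not the exact scheme of any real cryptocurrency: the `mine` function, the `difficulty` parameter and the block content are made up for illustration.

```python
import hashlib


def mine(block_data: str, difficulty: int = 3):
    """Toy proof of work (hypothetical helper, for illustration only).

    Tries successive nonces until the SHA-256 hex digest of
    block_data + nonce starts with `difficulty` zeros.
    """
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


# Made-up block content; each extra zero of difficulty multiplies
# the expected number of attempts by 16.
nonce, digest = mine("alice pays bob 1 coin", difficulty=3)
print(nonce, digest)
```

The loop does nothing useful besides finding a lucky nonce, which is exactly the "wasted" computation the author would like to redirect towards model training.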

Read The Convergence of AI and Blockchain: What’s the deal? — from Francesco Corea

7 — How to build your own AlphaZero AI using Python and Keras

alphazero

A lot of great articles have been published on AlphaGo and AlphaGo Zero in the past months, but none of them gave me as many insights as this one. The author implements the AlphaZero model on the game of Connect4, and as always, a concrete code example is worth a thousand words!

Read How to build your own AlphaZero AI using Python and Keras — from David Foster

8 — Quantum Machine Learning: An Overview

quantum machine learning

Have you heard about quantum computing? What about quantum machine learning? If not, don’t worry: this article gives you a little refresher and discusses what quantum computing could bring to ML, and vice versa.

Read Quantum Machine Learning: An Overview — from Reena Shaw

9 — Migrating to Python 3 with pleasure

python

Python 2 is dead, long live Python 3! Well, not quite yet, but most data science tools (pandas, NumPy…) will stop supporting Python 2 in 2020. It’s a good time to go over all the differences between the two. I even learned some tricks that I will definitely use from now on!
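To give a taste of what changes, here is a short sketch of a few well-known Python 3 features. This is my own selection, not necessarily the tricks covered in the article, and the names used (`model_name`, `accuracy`, `train`, the file path) are hypothetical.

```python
from pathlib import Path

# True division: "/" always returns a float, "//" is floor division.
half = 7 / 2         # 3.5 (Python 2 would give 3)
floor_half = 7 // 2  # 3

# f-strings (Python 3.6+) replace most uses of "%" and str.format.
model_name = "baseline"  # hypothetical model name
accuracy = 0.9137        # made-up value
report = f"{model_name}: accuracy={accuracy:.2%}"

# Extended unpacking with a starred name.
first, *middle, last = [1, 2, 3, 4, 5]

# pathlib builds paths with "/" instead of os.path.join.
checkpoint = Path("experiments") / model_name / "weights.h5"


# Keyword-only arguments: everything after "*" must be passed by name.
def train(data, *, epochs=10, verbose=False):
    return f"trained on {len(data)} samples for {epochs} epochs"


print(report)
print(train([1, 2, 3], epochs=2))
```

Keyword-only arguments are handy for functions with many numeric options, since a call like `train(data, 2)` fails loudly instead of silently binding `2` to the wrong parameter.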

Read Migrating to Python 3 with pleasure — from Alex Rogozhnikov

10 — Introduction to Python Ensembles

python ensembles

If you’re looking for the best possible prediction performance on an ML problem, you need to consider ensemble methods. If you haven’t had the chance to use them yet, follow this great Python tutorial!
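If the idea is new to you, here is a minimal sketch of the simplest kind of ensemble, majority voting. It is not the method from the tutorial: the three base "models" are deliberately naive, hypothetical rules, and only the standard library is used.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label among the base models' predictions."""
    return Counter(predictions).most_common(1)[0][0]


# Three deliberately weak, hypothetical base classifiers.
def by_length(text):
    return "spam" if len(text) > 40 else "ham"


def by_exclamation(text):
    return "spam" if "!" in text else "ham"


def by_keyword(text):
    return "spam" if "free" in text.lower() else "ham"


BASE_MODELS = [by_length, by_exclamation, by_keyword]


def ensemble_predict(text):
    """Ask every base model, then keep the label most of them agree on."""
    return majority_vote([model(text) for model in BASE_MODELS])


print(ensemble_predict("Free tickets!"))     # two of three rules say spam
print(ensemble_predict("See you at lunch"))  # all three rules say ham
```

With an odd number of voters and two labels there can be no tie, and a single rule that misfires is outvoted by the other two. Real ensembles apply the same principle to trained models.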

Read Introduction to Python Ensembles — from Sebastian Flennerhag

Thanks to Nicolas Jean, Alexandre Sapet, Olivier Chancé, and Flavian Hautbois. 
