Best Of AI May 2019
June 12, 2019

The Best of AI: New Articles Published This Month (May 2019)

10 data articles handpicked by the Sicara team, just for you

Welcome to the May edition of our best and favorite articles in AI that were published this month. We are a Paris-based company that does Agile data development. This month, we spotted articles about ICLR 2019, image processing in Python, and other hot topics. We advise you to have a Python environment ready if you want to follow some tutorials :). Let’s kick off with the comic of the month:

It’s basically Pascal’s Wager for the paranoid prankster

1 — Top 8 trends from ICLR 2019

Samples from the BigGAN model

We could not have written this Best of AI without mentioning the 7th edition of the International Conference on Learning Representations (ICLR), which took place in New Orleans at the beginning of the month. Did not get the chance to go to this top-notch research event? No worries! We came across a summary of the conference by a researcher from NVIDIA. This technical article will give you a great overview of some of the hottest keywords in AI research. Look them up if you do not know them all!

Read Top 8 trends from ICLR 2019 — from Chip Huyen

2 — Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask


Each year at ICLR, a few papers receive the conference's best paper award. One of the two winners of this edition comes from the research labs of the Massachusetts Institute of Technology (MIT): The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks, from Jonathan Frankle and Michael Carbin. This excellent paper introduces the lottery ticket hypothesis and supports it with experiments. The hypothesis states that dense, randomly initialized networks contain much smaller sub-networks that, when trained on their own, reach almost the same accuracy as the full network. The paper also gives an algorithm to retrieve these “winning tickets”.
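To make the idea concrete, here is a minimal sketch of iterative magnitude pruning as we understand it from the paper: train, prune the smallest surviving weights, reset the survivors to their initial values, and repeat. The function names and the `train` callback are our own placeholders, not code from the authors, and a real network would use tensors rather than flat Python lists.

```python
def magnitude_mask(weights, mask, prune_fraction):
    """Return a new mask with the smallest-magnitude active weights removed."""
    # Only weights that are still active (mask == 1) compete for pruning.
    active = sorted((abs(w), i) for i, (w, m) in enumerate(zip(weights, mask)) if m)
    n_prune = int(len(active) * prune_fraction)
    new_mask = list(mask)
    for _, i in active[:n_prune]:
        new_mask[i] = 0
    return new_mask


def find_winning_ticket(init_weights, train, rounds=3, prune_fraction=0.2):
    """Iteratively prune, rewinding surviving weights to their initial values.

    `train` is a hypothetical callback: train(weights, mask) -> trained weights.
    """
    mask = [1] * len(init_weights)
    for _ in range(rounds):
        # Rewind: every round restarts from the original initialization.
        weights = [w * m for w, m in zip(init_weights, mask)]
        trained = train(weights, mask)
        mask = magnitude_mask(trained, mask, prune_fraction)
    ticket = [w * m for w, m in zip(init_weights, mask)]
    return ticket, mask
```

The returned `ticket` is the sparse network at its original initialization, which is what would then be trained in isolation.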

Read Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask — on Uber Engineering Blog

3 — Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks


The second winning paper of ICLR 2019 is Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks, from the Montreal Institute for Learning Algorithms (MILA) and the Microsoft Research Montréal lab. The authors introduce a novel deep learning architecture called ON-LSTM, short for Ordered Neurons LSTM. It aims at explicitly modelling hierarchical structure in Long Short-Term Memory (LSTM) networks. This architecture achieves good performance on various Natural Language Processing tasks, such as language modelling, unsupervised parsing, targeted syntactic evaluation, and logical inference.
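As far as we understand the paper, the ordering of neurons relies on an activation the authors call cumax: a cumulative sum taken over a softmax. The sketch below is our own plain-Python illustration of that single function, not the authors' implementation, and it leaves out the rest of the ON-LSTM cell.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cumax(xs):
    """Cumulative sum of softmax(xs): non-decreasing values that end at 1."""
    out, running = [], 0.0
    for p in softmax(xs):
        running += p
        out.append(running)
    return out
```

Because the output rises monotonically from near 0 to 1, it behaves like a soft split point along the neuron ordering.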

Read Brief Report: Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks — on Microsoft Research Blog

4 — Introducing Translatotron: An End-to-End Speech-to-Speech Translation Model

Speech-to-speech translation systems aim at translating spoken statements from one language to another. The translation usually goes through an intermediate text transcription of the input signal. These state-of-the-art cascade translators are the very same ones used by popular products such as Google Translate, but they could be replaced in the near future by an end-to-end method that does not rely on an intermediate text translation. The latter comes with promising improvements over classical speech-to-speech translation systems, such as faster translation and the ability to retain the original speaker's voice characteristics in the translated speech.

Read Introducing Translatotron: An End-to-End Speech-to-Speech Translation Model — on Google AI Blog

5 — Open-sourcing Ax and BoTorch: New AI Tools for Adaptive Experimentation

Exploring the parameter space of a Machine Learning model is one of the most time-consuming activities of a data scientist. Facebook held its 2019 F8 developer conference at the beginning of the month. For the occasion, the company announced the open-sourcing of two optimization tools it had been using for some time already: Ax and BoTorch. The first one lets you automate and monitor adaptive experiments. The second one is built on PyTorch and provides a framework for Bayesian optimization. They can be paired to automate the hyperparameter optimization of your machine learning model.
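If you have never automated a hyperparameter search, the basic loop looks like the sketch below: propose parameters, evaluate them, keep the best. Note that this is plain random search written with the standard library only, a deliberately simple stand-in and not Bayesian optimization; it does not use the Ax or BoTorch APIs. The function name, the `bounds` format, and the `objective` callback are our own assumptions.

```python
import random


def random_search(objective, bounds, n_trials=50, seed=0):
    """Minimize `objective` by sampling parameters uniformly within `bounds`.

    `bounds` maps a parameter name to a (low, high) tuple.
    `objective` takes a dict of parameters and returns a score to minimize.
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_trials):
        # Propose a candidate, evaluate it, and keep it if it beats the best so far.
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

A Bayesian optimizer replaces the uniform proposal step with a model of the objective that suggests promising candidates, which usually means fewer expensive evaluations.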

Read Open-sourcing Ax and BoTorch: New AI tools for adaptive experimentation — on Facebook AI Blog

6 — 10 Python Image Manipulation Tools

If there is one thing we learned across our computer vision projects at Sicara, it is that good performance cannot be achieved without well-designed processing of the images fed to our models. We know it may be tough to find one’s way among the many existing tools and multi-purpose libraries. So we thought a quick overview of image manipulation tools in Python could be handy. From staples of data science toolkits, such as NumPy, OpenCV, and scikit-image, to more advanced or specific libraries such as SimpleITK, do not be afraid to spend some time exploring the vast possibilities of those libraries.

Read 10 Python image manipulation tools — from Parul Pandey

7 — Moving Camera, Moving People: A Deep Learning Approach to Depth Prediction

How can a short but viral social network trend fuel scientific research in AI? The Mannequin Challenge, in which people from all over the world filmed themselves frozen in action while the camera tours the scene, flooded the web from the end of 2016 to the beginning of 2017. This huge video database was used as training data for the deep networks of Google AI researchers. The aim was to infer depth in scenes where both the camera and the people are moving, a task that classical methods based on triangulation do not handle easily. The compelling results pave the way for cool applications, such as generating stereo video from monocular footage and producing a range of 3D-aware video effects.
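For context, classical triangulation boils down, in the simplest rectified two-view case, to depth = focal length × baseline / disparity. The sketch below is our own illustration of that textbook formula, not code from the Google AI post. It assumes the same scene point is seen from two viewpoints and has not moved in between, which is exactly what breaks when people are moving.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth in meters of a static point seen in two rectified views.

    focal_length_px: focal length in pixels.
    baseline_m: distance between the two camera positions, in meters.
    disparity_px: horizontal shift of the point between the views, in pixels.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline_m / disparity_px
```

For example, with a 700-pixel focal length, a 0.5 m baseline, and a 35-pixel disparity, the point lies 10 m away.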

Read Moving Camera, Moving People: A Deep Learning Approach to Depth Prediction — from Google AI Blog

8 — The Definite Guide For Creating An Academic-Level Dataset With Industry Requirements And Constraints


Generating a proper dataset for an industrial machine learning project can be cumbersome, especially when it comes to labelling its samples. Zencity is a company that leverages data and machine learning to assist local governments in their decision making, mainly based on their citizens’ feedback. They needed a massive dataset for one of their research projects in Sentiment Analysis. One of their lead data scientists explains how they managed to build it, from the various annotation tools explored to the sampling technique employed. Even though some points are specific to the task at hand, the overall approach can be usefully adapted to other projects.

Read The Definite Guide For Creating An Academic-Level Dataset With Industry Requirements And Constraints — from Ori Cohen

9 — Creating Personalized Photo-Realistic Talking Head Models


Following the lead of DeepFake and OpenAI's GPT-2, a new paper released this month raises concerns by making it easier to generate fake news and misinformation. This paper, jointly published by the Samsung AI Center and the Skolkovo Institute of Science and Technology in Moscow, is far from being the first attempt to apply deep learning to the animation of faces in pictures to create fake animated sequences. However, it stands out by its ability to do so with only a few samples (if not just one) of the face to be animated, by training its adversarial network with few-shot learning.

Read Creating Personalized Photo-Realistic Talking Head Models — from Christopher Dossman

10 — The Best and Most Current of Modern Natural Language Processing


If you are getting into Natural Language Processing and want to get a good grip on the latest academic research in the field, this article is the perfect fit to populate your reading list. It lacks a few introductory references for beginners but comes with popular course materials, podcasts, and interesting blogs if you want to catch up on the subject.

Read The Best and Most Current of Modern Natural Language Processing — from Victor Sanh

Thanks to Hugo Lime, Florian Carra, Geoffroy de Boissieu, and Antoine Moreau. 
