Welcome to the first edition of the year of our favorite AI articles published this month. We are a Paris-based company that does Agile data development. This month, we spotted articles about natural language processing, MNIST, blockchain, ensemble methods and more. We advise you to have a Python environment ready if you want to follow some of the tutorials :). Let’s kick off with the comic of the month:
To start, an article on the differences between AI, data science and machine learning. The main idea I keep from this article is that “artificial intelligence produces actions”. This reminded me of a TED talk I watched a few years back. A scientist explains how a certain sea squirt uses its nervous system to move around at the beginning of its life. Once it has settled on a rock, it doesn’t need its brain anymore, so it just eats it … Food for thought!
This article goes over ten common machine learning algorithms. The author doesn’t go into detail and I was already familiar with most of them, but it introduced me to the learning vector quantization algorithm, so maybe you’ll learn something new too.
The next one really spoke to me: it’s an exploration of the MNIST handwritten digits dataset. I had the opportunity to benchmark some models on this dataset, and this post really made the data talk. I really enjoyed the simple steps the author takes to detect where an ML algorithm would run into difficulties, even before applying any model.
I don’t know if you remember the paper that claimed to use deep learning to identify whether you were gay or straight from your facial features? This article, written by research scientists from Google AI Group, analyses the dataset used in this study, and the conclusions are extremely interesting! It really made me think about the way I interpret my ML results, and how easy it is to end up with a biased dataset.
This one may be my favorite this month, although I may be biased because it’s written by my engineering school flatmate :). It takes you progressively around the major NLP algorithms and explains how to interpret your models’ results.
By the way, we’re preparing a blog article on training Named Entity Recognition models on foreign languages. Don’t forget to follow us!
The last couple of months have been tumultuous in the cryptocurrency world, and a lot of questions were raised concerning the blockchain. The author shares his views on the possible intersections of AI and blockchain, and he raises some intriguing points about data marketization and decentralized intelligence. What if all the energy used by miners were used to train ML algorithms instead of “dumbly” computing hashes? Makes you dream, doesn’t it?
By the way, if the blockchain concept is still a little bit obscure to you, check out this really cool article where the author codes a blockchain in Python from scratch.
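To give a feel for the idea, here is a minimal sketch of a hash-linked chain. It is not the code from the linked article: the function names (`block_hash`, `make_block`, `build_chain`, `is_valid`) and the block fields are our own assumptions, and it leaves out proof of work, networking and consensus entirely.

```python
import hashlib
import json


def block_hash(index, data, previous_hash):
    """Hash the block's content together with the hash of the previous block."""
    payload = json.dumps(
        {"index": index, "data": data, "previous_hash": previous_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def make_block(index, data, previous_hash):
    """Build a block as a plain dict (hypothetical layout, for illustration only)."""
    return {
        "index": index,
        "data": data,
        "previous_hash": previous_hash,
        "hash": block_hash(index, data, previous_hash),
    }


def build_chain(items):
    """Start from a genesis block and append one block per item."""
    chain = [make_block(0, "genesis", "0" * 64)]
    for item in items:
        previous = chain[-1]
        chain.append(make_block(previous["index"] + 1, item, previous["hash"]))
    return chain


def is_valid(chain):
    """Check that every stored hash matches its content and links to its predecessor."""
    for i, block in enumerate(chain):
        expected = block_hash(block["index"], block["data"], block["previous_hash"])
        if block["hash"] != expected:
            return False
        if i > 0 and block["previous_hash"] != chain[i - 1]["hash"]:
            return False
    return True


chain = build_chain(["Alice pays Bob 5", "Bob pays Carol 2"])
print(is_valid(chain))

# Tampering with an earlier block breaks validation.
chain[1]["data"] = "Alice pays Bob 500"
print(is_valid(chain))
```

The point of the sketch is the linking: because each block stores the hash of the previous one, changing any past block invalidates the chain from that block onwards.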
A lot of great articles have been published on AlphaGo and AlphaGo Zero in the past months, but none of them gave me as many insights as this one. The author implements the AlphaZero model on the game of Connect4 and, as always, a concrete code example is worth a thousand words!
Have you heard about quantum computing? What about quantum machine learning? If not, don’t worry: this article gives you a little refresher and discusses what quantum computing could bring to ML and vice versa.
Python 2 is dead, long live Python 3! Well, not quite yet, but most of the data science tools (pandas, numpy… ) will stop supporting Python 2 in 2020. It’s a good time to go over all the differences between the two. I even learned some tricks that I will definitely use from now on!
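As a taste of what changes, here are a few Python 3 features we find handy. This is our own selection, not necessarily the tricks covered in the linked article, and the `train` function and the `models/baseline.pkl` path are hypothetical names used only for illustration.

```python
from pathlib import Path

# True division: "/" always returns a float in Python 3 (Python 2 gave 1 here).
ratio = 3 / 2
floor_ratio = 3 // 2  # explicit floor division when you want an integer

# Extended unpacking: grab the head of a list and keep the rest.
first, *rest = [1, 2, 3, 4]


# Keyword-only arguments: everything after "*" must be passed by name.
def train(data, *, epochs=10):
    # f-strings (Python 3.6+) make formatting shorter and more readable.
    return f"training on {len(data)} samples for {epochs} epochs"


# pathlib builds paths with "/" instead of os.path.join.
model_path = Path("models") / "baseline.pkl"

print(ratio, floor_ratio)
print(first, rest)
print(train([0.1, 0.2], epochs=3))
print(model_path.name)
```

Calling `train([0.1, 0.2], 3)` would raise a `TypeError`, which is exactly the point of keyword-only arguments: they prevent positional mix-ups on functions with many options.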
If you’re looking for the best possible prediction performance while tackling an ML problem, you need to consider ensemble methods. If you haven’t had the chance to use them in the past, follow this great Python tutorial!
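To illustrate the intuition behind the simplest ensemble, majority voting, here is a small sketch. It is not taken from the linked tutorial: the predictions below are hand-written, hypothetical outputs standing in for three trained models, and `majority_vote` and `accuracy` are helper names of our own. In practice you would rather reach for a library such as scikit-learn.

```python
from collections import Counter


def majority_vote(model_predictions):
    """Combine one list of labels per model into a single list, voting per sample."""
    return [
        Counter(sample_votes).most_common(1)[0][0]
        for sample_votes in zip(*model_predictions)
    ]


def accuracy(predicted, expected):
    """Fraction of samples where the prediction matches the true label."""
    return sum(p == e for p, e in zip(predicted, expected)) / len(expected)


# Hypothetical true labels and the outputs of three imperfect models.
y_true = [0, 1, 1, 0, 1, 0]
preds_a = [1, 1, 1, 0, 1, 0]  # wrong on the first sample
preds_b = [0, 1, 0, 0, 1, 0]  # wrong on the third sample
preds_c = [0, 1, 1, 0, 1, 1]  # wrong on the last sample

ensemble = majority_vote([preds_a, preds_b, preds_c])

for name, preds in [("a", preds_a), ("b", preds_b), ("c", preds_c), ("vote", ensemble)]:
    print(name, accuracy(preds, y_true))
```

Each model makes one mistake, but never on the same sample, so the vote corrects all of them. Real models are rarely this conveniently uncorrelated, which is why ensemble methods put so much effort into making their members diverse.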