How to convert tables from PDF to Excel or CSV with Tabula

Convert PDF into CSV and extract data from tables

One of the most laborious tasks in Machine Learning is data collection and preparation.

There is a meteorological observatory in my city. You can see the main meteorological indicators in real time through its website, and it shares historical data too, but only as PDF.

I have talked with them about sharing all the data in CSV so that people can use it easily, but it seems that is not possible 🙁

I still want this data, so I need to convert these PDF files into a workable data collection. I have been searching for a good way to convert these PDF tables to CSV, and the solution is called Tabula.

Once you have the data in CSV you can use it in many ways: opening it with Excel, LibreOffice, Google Sheets, etc., because it is easy to import into spreadsheets, or working with it in Python and its libraries.

Since I want an automated process, I will work with a Python script, and this is where Tabula comes in.
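As a rough idea of what such a script could look like, here is a minimal sketch. It assumes the tabula-py package (the Python wrapper for Tabula, which needs Java installed) and uses its `convert_into` function. The helper names `csv_path_for` and `pdf_tables_to_csv` and the file name in the usage note are made up for illustration; they do not come from the observatory or from Tabula.

```python
from pathlib import Path


def csv_path_for(pdf_path):
    """Return the CSV path that sits next to the given PDF (hypothetical helper)."""
    return Path(pdf_path).with_suffix(".csv")


def pdf_tables_to_csv(pdf_path, pages="all"):
    """Extract the tables of a PDF into a single CSV file with tabula-py.

    Assumes tabula-py is installed (pip install tabula-py) and Java is available.
    """
    import tabula  # imported here so the path helper works without tabula-py

    out_path = csv_path_for(pdf_path)
    # convert_into writes every table found on the selected pages to out_path
    tabula.convert_into(str(pdf_path), str(out_path),
                        output_format="csv", pages=pages)
    return out_path
```

Calling `pdf_tables_to_csv("weather_2019.pdf")` should then leave a `weather_2019.csv` next to the PDF. How clean the result is depends a lot on how the tables are laid out in the PDF, so expect to review the output.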


How to install and manage Anaconda

Anaconda: Data Science, Big Data and Python, R distribution

This article is an Anaconda installation guide and also a guide to its package manager, Conda. With these tools we can create development environments for Python and R with the libraries we prefer. It is a very good way to start learning Machine Learning, data analysis and programming with Python.

Anaconda is a free and open source distribution of the Python and R languages. It is widely used in Data Science, Machine Learning, science, engineering, predictive analytics, Big Data, etc.

Installing Anaconda gives us access to a large number of packages: more than 1,400 of the best-known applications and libraries. Some examples are:

  • Jupyter Notebook
  • NumPy
  • Pandas
  • TensorFlow
  • H2O.ai
  • SciPy
  • Jupyter
  • Dask
  • OpenCV
  • Matplotlib
  • Scrapy
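To see which of these libraries are available in the active environment, a small check like the following can help. It is only a sketch that uses the standard library's `importlib.util.find_spec`. The mapping from display names to import names (for example `cv2` for OpenCV) is my assumption and may need adjusting, and `installed_packages` is a hypothetical helper name.

```python
from importlib.util import find_spec

# Display name -> import name. The import names are assumptions and can
# differ from the package name (OpenCV, for instance, is imported as cv2).
PACKAGES = {
    "NumPy": "numpy",
    "Pandas": "pandas",
    "TensorFlow": "tensorflow",
    "SciPy": "scipy",
    "Dask": "dask",
    "OpenCV": "cv2",
    "Matplotlib": "matplotlib",
    "Scrapy": "scrapy",
}


def installed_packages(packages):
    """Map each display name to True if its module can be found, else False."""
    return {label: find_spec(module) is not None
            for label, module in packages.items()}


if __name__ == "__main__":
    for label, available in installed_packages(PACKAGES).items():
        print(f"{label}: {'installed' if available else 'missing'}")
```

Running it inside different Conda environments shows quickly which environment has which libraries, without actually importing them.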


Coursera Machine Learning course review

Last year I finished the Machine Learning course on Coursera by Stanford University and Andrew Ng.

It is a free course about Machine Learning, with a little Deep Learning, created by Andrew Ng and Stanford University. Although it is free, you can purchase a certificate for $70. It is divided into three parts: videos, quizzes and programming exercises.

You watch the videos, take a quiz and do a practical exercise, designing part of an algorithm and then implementing and testing it with MATLAB or Octave.


Courses to learn Machine Learning, Deep Learning and AI

Courses about Machine Learning and Deep Learning. The importance of data

These are good resources I have found to teach yourself Machine Learning, Deep Learning and other AI subjects.

There are free and paid courses at different levels.

Free Courses

Beginner

These are short courses (from 1 to 20 hours), meant as a first contact with the subject.
