🇨🇴 @guerravis
🇺🇸 @duto_guerra

Opening the black box
from data to insights

John Alexis Guerra Gómez



The purpose of visualization is insight, not pictures

How to make sense of data?

  • Statistical Analysis
  • Machine Learning and Artificial Intelligence
  • Visual Analytics (and data analytics)

Why should we visualize?

   x      y      x      y      x      y      x      y
10.0   8.04   10.0   9.14   10.0   7.46    8.0   6.58
 8.0   6.95    8.0   8.14    8.0   6.77    8.0   5.76
13.0   7.58   13.0   8.74   13.0  12.74    8.0   7.71
 9.0   8.81    9.0   8.77    9.0   7.11    8.0   8.84
11.0   8.33   11.0   9.26   11.0   7.81    8.0   8.47
14.0   9.96   14.0   8.10   14.0   8.84    8.0   7.04
 6.0   7.24    6.0   6.13    6.0   6.08    8.0   5.25
 4.0   4.26    4.0   3.10    4.0   5.39   19.0  12.50
12.0  10.84   12.0   9.13   12.0   8.15    8.0   5.56
 7.0   4.82    7.0   7.26    7.0   6.42    8.0   7.91
 5.0   5.68    5.0   4.74    5.0   5.73    8.0   6.89

Property                                                Value
Mean of x                                               9
Variance of x                                           11
Mean of y                                               7.50
Variance of y                                           4.125
Correlation between x and y                             0.816
Linear regression                                       y = 3.00 + 0.500x
Coefficient of determination of the linear regression   0.67
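
The table above is Anscombe's quartet: four small datasets whose summary statistics are nearly identical even though their shapes differ. Here is a minimal sketch, assuming plain JavaScript with no libraries and sample (n - 1) variances, that recomputes the listed properties for each x/y column pair. The names `quartet`, `mean`, `cov` and `summarize` are illustrative and not from the slides.

```javascript
// Anscombe's quartet, copied from the table above (datasets I to III share x).
const x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5];
const quartet = {
  I:   { x: x123, y: [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68] },
  II:  { x: x123, y: [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74] },
  III: { x: x123, y: [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73] },
  IV:  { x: [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
         y: [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89] },
};

// Arithmetic mean of an array of numbers.
const mean = (v) => v.reduce((a, b) => a + b, 0) / v.length;

// Sample covariance (n - 1 denominator); cov(v, v) is the sample variance.
const cov = (a, b) => {
  const ma = mean(a);
  const mb = mean(b);
  return a.reduce((s, ai, i) => s + (ai - ma) * (b[i] - mb), 0) / (a.length - 1);
};

// Summary statistics plus the least-squares line y = intercept + slope * x.
function summarize({ x, y }) {
  const slope = cov(x, y) / cov(x, x);
  const r = cov(x, y) / Math.sqrt(cov(x, x) * cov(y, y));
  return {
    meanX: mean(x), varX: cov(x, x),
    meanY: mean(y), varY: cov(y, y),
    r, r2: r * r,
    slope, intercept: mean(y) - slope * mean(x),
  };
}

const summaries = Object.fromEntries(
  Object.entries(quartet).map(([name, d]) => [name, summarize(d)])
);

for (const [name, s] of Object.entries(summaries)) {
  console.log(name, Object.fromEntries(
    Object.entries(s).map(([k, v]) => [k, v.toFixed(3)])
  ));
}
```

All four rows print values close to the table, which is why the numbers alone cannot tell the datasets apart and a plot can.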




In Infovis we look for Insights

  • Deep understanding
  • Meaningful
  • Non-obvious
  • Actionable
  • Based on data

How do I do it?

What do I use?


What car should I buy?

Normal procedure

Ask friends and family

Renault 4
Renault 4 JP4
Partially folded Renault 4 at the roadside


That's inferring statistics from a sample n=1

Better approach

Data-based decisions

Screenshot Tucarro.com

Social Networks

Twitter election analysis

Presidential Election


How can you do it?

You get a new dataset

What do you do with it?

Mexican Data 🇲🇽

Mexican Budget Open Data Site


Jupyter Notebook/R?

Jupyter Notebook demo


Voyager2 with 50MB

Voyager2 with MoMA Collection

Parallel Coordinates?

Parallel Coordinates Fifa Dataset


Scatterplot Matrix NBA Dataset


Navio thumb

A viz Widget

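// Create a Navio widget inside the #navio element (second argument: height in pixels)
const nv = navio(d3.select("#navio"), 600);

// Call a function with the current selection whenever it changes (doSmthng is a placeholder)
nv.updateCallback( sel => doSmthng(sel) );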


  • Usability study
  • Data Scientists Experiment
  • Domain Experts Validation

Usability Study

MoMA original search interface

Usability Study

  • 9 participants
  • 2 UIs
  • Errors/Time + User satisfaction
Navio Usability Study results
Navio Usability Study errors/time results

Insight based experiments

Data Scientists

  • 4 participants
  • Exploring their own data
  • Discover insights

Political scientists

  • 6 political experts
  • Exploring their own data
  • Discover insights
Navio Usability Study results

Spinoff projects


Configure and setup Navio

Juan Guillermo Murillo


Backend for scaling up Navio

Juan Camilo Ortiz

TADAVA Architecture



Can we use Navio with Voyager?

Lady Pinzón

Standalone Shipyard

Can we use more resources for Shipyard running locally?

Felipe Sabogal
Standalone shipyard


2 MSc, 3 undergrads

Free (as in Libre) Software

Do you ML?

Multivariate data? -> Dimensionality reduction + clustering


Fabián Peña

Opening the black box

Rappi on Twitter

  • 30k tweets in the last 7 days

It's up to you!

  • Interactivity 👉 Ask questions
  • Slice and dice
  • Overview first, Zoom/Filter, then details on demand

Rappi Dashboard Link 😉

😡😠😒😐😃🥰?

  • Machine learning 🎩! ???
  • Detects sentiment ! ???

I hired a data 🐒 (might be me)

Analyze 180 tweets

  • 😡😠😒😐😃🥰

Here are some of them

Rappi tweet
😐 -10%
Rappi tweet
😡 -80%
Rappi tweet
🥰 80%
Rappi tweet
😐 -10%
Rappi tweet
😐 -20%
Rappi tweet
🥰 90%
Rappi tweet
😒 -40%
Rappi tweet
😒 -30%

Would you hire this data 🐒?

Well... actually

  • It wasn't a data 🐒
  • It was a 💻
  • Would you use it?


Rappi tweet
😠 -50%
Rappi tweet
😐 -10%
Rappi tweet
😠 -60%
Rappi tweet
😠 -50%
Rappi tweet
😡 -70%
Rappi tweet
😡 -80%
Rappi tweet
😡 -80%
Rappi tweet
😡 -70%

Well... actually

Will you trust it?

I don't

Don't eat Machine Learning, eat 🍖!


Wingz and Beer logo

Take home messages

Focus on insights!!!

We need more open data!

Colombian High Schools

How can I get Insights too?

No need to wait for Stanford, MIT or Berkeley to help you

IMAGINE Research Group

  • Visual Analytics
  • Virtual/Augmented Reality
  • Visual Computing
  • Mobile Robotics
  • Machine Learning
Imagine Reel


  • πŸ‘‰πŸΌ Insights! πŸ‘ˆπŸΌ
  • Open data and share
  • Ask for infovis
  • Evaluate/Explain your models

John Alexis Guerra Gómez



Big Data?

You might have heard of the Vs of Big Data

  • Volume
  • Velocity
  • Variety
  • and Veracity and Value

Too ambiguous!! 🤦🏽‍♀️ Let's go beyond that

How Big is big?

Can you fit it in one computer?

Yes? 👉🏼 Then it's not really big 🤷🏽‍♀️

Why this criterion?

Big data 👉🏼 Big overhead

Example: photo collection

  • One photo 👉🏼 10MB
  • 1k photos in a 📱 👉🏼 10MB * 1k = 10000MB = 10GB
  • 50k photos in your 💻 👉🏼 10MB * 50k = 500GB

Big Data? 🙅🏽‍♂️

How many blue photos are in my collection?

How do you compute this?

  • Put all your photos in one 💻
  • Go through all the collection and count the blue ones

Flickr scale

80+ trillion photos (80,000,000,000,000)

That's big data
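
The "does it fit on one computer?" test can be sketched as simple arithmetic. This is a minimal sketch that uses the 10MB-per-photo figure from the photo collection example and decimal units (1GB = 1000MB). The 2TB disk size and the names `DISK_TB`, `collectionSizeMB` and `fitsOnOneComputer` are assumptions for illustration, not from the slides.

```javascript
// Size per photo, taken from the photo collection example.
const MB_PER_PHOTO = 10;
// Hypothetical disk capacity of a single computer, in TB (an assumption).
const DISK_TB = 2;

// Total collection size in MB for a given number of photos.
function collectionSizeMB(photoCount) {
  return photoCount * MB_PER_PHOTO;
}

// True when the whole collection fits on one disk of diskTB terabytes.
function fitsOnOneComputer(photoCount, diskTB = DISK_TB) {
  return collectionSizeMB(photoCount) <= diskTB * 1e6; // 1TB = 1,000,000MB
}

const collections = {
  phone: 1e3,    // 1k photos
  laptop: 50e3,  // 50k photos
  flickr: 80e12, // 80 trillion photos, the Flickr-scale figure
};

for (const [name, count] of Object.entries(collections)) {
  const gb = collectionSizeMB(count) / 1000;
  console.log(`${name}: ${gb} GB, fits on one computer: ${fitsOnOneComputer(count)}`);
}
```

With these assumptions the phone and laptop collections fit on one disk and the Flickr-scale one does not, which is the criterion the slides use to call something big data.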

How many blue photos are on Flickr?

How do you compute this?

  • Distribute the data among 100s of 💻💻💻 (a cluster)
  • Compute subtotals on each data part. (Map)
  • Aggregate the subtotals into one big total. (Reduce)
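
The three steps above can be sketched in a few lines. This is a single-process simulation, not a real cluster: the partitions stand in for machines, and the photo records, the `color` field and the function names (`partition`, `mapCountBlue`, `reduceTotals`) are hypothetical.

```javascript
// Hypothetical photo records; a real pipeline would first classify each image.
const photos = [
  { id: 1, color: "blue" },  { id: 2, color: "red" },
  { id: 3, color: "blue" },  { id: 4, color: "green" },
  { id: 5, color: "blue" },  { id: 6, color: "red" },
  { id: 7, color: "blue" },  { id: 8, color: "green" },
  { id: 9, color: "red" },   { id: 10, color: "red" },
];

// Distribute: split the data round-robin into n parts (one per machine).
function partition(items, n) {
  const parts = Array.from({ length: n }, () => []);
  items.forEach((item, i) => parts[i % n].push(item));
  return parts;
}

// Map: each machine counts the blue photos in its own part.
const mapCountBlue = (part) => part.filter((p) => p.color === "blue").length;

// Reduce: add the subtotals into one total.
const reduceTotals = (subtotals) => subtotals.reduce((a, b) => a + b, 0);

const parts = partition(photos, 3);
const subtotals = parts.map(mapCountBlue);
const totalBlue = reduceTotals(subtotals);

console.log("subtotals per machine:", subtotals);
console.log("blue photos in total:", totalBlue);
```

On a real cluster each map call would run on a different machine, and the framework would also have to handle scheduling and machines that fail, which is the overhead the next slides point at.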

How many computers do you need?

What if one computer breaks? ☒️


Big Data? 👉🏼 Only if it doesn't fit on one 💻

⚠️ Use it only if you must ⚠️

But don't panic!

Let me share a secret


My wife tells it to me all the time!

Size doesn't really matter

What matters are the insights 👍

Insights ?

Making Sense of Data

Anti-corruption referendum

What about the opposition?


Other Insights


Task: Changes in drugs' adverse effect reports

User: FDA Analysts

State of the art


Health insurance claims

Task: Detect fraud networks

User: Undisclosed Analysts



Ego distance

My Facebook


Types of Visualization

  • Infographics
  • Scientific Visualization (sciviz)
  • Information Visualization (infovis, datavis)


Scientific Visualization

  • Inherently spatial
  • 2D and 3D

Information Visualization

Infovis Basics

Visualization Mantra

  • Overview first
  • Zoom and Filter
  • Details on Demand

Data Types

1-D Linear: Document Lens, SeeSoft, Info Mural
2-D Map: GIS, ArcView, PageMaker, Medical imagery
3-D World: CAD, Medical, Molecules, Architecture
Multi-Var: Spotfire, Tableau, GGobi, TableLens, ParCoords
Temporal: LifeLines, TimeSearcher, Palantir, DataMontage, LifeFlow
Tree: Cone/Cam/Hyperbolic, SpaceTree, Treemap, Treeversity
Network: Gephi, NodeXL, Sigmajs