twitter Archives - Maziyar PANAHI

December 3, 2015

Big Data: HOW TO SCALE FROM ZERO TO BILLIONS!

How the Big Data platform at ISCPIF (CNRS) scaled from zero to billions of data points within 6 months.

This talk covers our use of Elasticsearch, MongoDB, Redis, and RabbitMQ, along with the scalable, highly available Web services built on top of our Big Data architecture.

This talk was given at Université Paris-Sud, LAL, Bâtiment 200, at an event organized by ARGOS. https://indico.mathrice.fr/event/2/overview

ISCPIF: http://iscpif.fr
Big Data at ISCPIF: http://bigdata.iscpif.fr
Climate at ISCPIF: http://climate.iscpif.fr
Playground for climate: http://climate.iscpif.fr/playground
Tweetoscope: http://tweetoscope.iscpif.fr

November 15, 2015 (updated December 10, 2015)

Paris before and during terrorist attacks

Night of terror

The data from Twitter shows some upsetting stats about the Paris terrorist attacks of 13 November.

As can be seen, the query for terrorist attacks returns no results before 22h30. Sadly, this shows that even without any knowledge of the event itself (from media or news), it is possible to infer that something related to the queried terms must have happened.


July 25, 2014 (updated March 14, 2016)

Building a real-time news analytics

It is always interesting to find out exactly what is happening around the world as it happens. Knowing about something as it happens is called “real time”, or more accurately “near real time”, because of network latency and the delays in processing, visualizing, and streaming the data.

Now imagine you can monitor most international news agencies on Twitter in real time. What I have been developing for the last few months is an application that not only delivers important news and highlights from around the world as they happen, but also persists the data for a 24-hour window, so you can always go back and read important news and highlights that have already happened.
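
The 24-hour window can be pictured as a queue that drops anything older than a day. This is only a minimal sketch of that idea, not the code behind the application: the class name `RollingNewsWindow`, its methods, and the sample headlines are all made up for illustration, and it assumes items arrive in roughly chronological order.

```python
from collections import deque
from datetime import datetime, timedelta, timezone


class RollingNewsWindow:
    """Keep only the items seen during the last `window` of time (hypothetical helper)."""

    def __init__(self, window=timedelta(hours=24)):
        self.window = window
        self._items = deque()  # (timestamp, headline) pairs, oldest first

    def add(self, timestamp, headline):
        # Assumes items arrive in roughly chronological order, as a stream does.
        self._items.append((timestamp, headline))
        self._evict(now=timestamp)

    def _evict(self, now):
        # Drop everything that fell out of the window.
        cutoff = now - self.window
        while self._items and self._items[0][0] < cutoff:
            self._items.popleft()

    def recent(self, now):
        self._evict(now)
        return [headline for _, headline in self._items]


start = datetime(2014, 7, 25, 9, 0, tzinfo=timezone.utc)
news = RollingNewsWindow()
news.add(start, "headline A")
news.add(start + timedelta(hours=20), "headline B")

# 25 hours after the first item, only the second one is still inside the window.
recent_after_25h = news.recent(now=start + timedelta(hours=25))
```

In a real deployment the same effect is often achieved by letting the datastore expire old records, but that is a design choice the post does not describe.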


March 11, 2014 (updated November 16, 2015)

Some of the stats during Oscars 2014


February 17, 2014 (updated November 16, 2015)

Building Search Engine by ElasticSearch

I’ve been working with ElasticSearch for a while now. I gotta say it’s a powerful open-source tool for building distributed real-time search and analytics engines.

It also uses shards and replicas spread across distributed machines to make your architecture reliable and scalable. It’s not just a simple full-text search engine, even though it does that perfectly.

My ElasticSearch setup is connected to my MongoDB replica set and indexes tweets as they are streamed by the Twitter API. I track tweets for a few projects (health, news, UN GlobalPulse, etc.) and try to index some of them in ES on the fly.
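
To give an idea of what indexing tweets on the fly involves, here is a rough sketch that turns tweets into the newline-delimited body expected by the ElasticSearch bulk API. It is an illustration only, not the connector used in this setup: the function name `to_bulk_lines`, the index name `tweets`, and the choice of fields are assumptions, and the exact metadata depends on the ES version (older versions also expect a `_type`).

```python
import json


def to_bulk_lines(tweets, index_name="tweets"):
    """Turn tweet dicts into the newline-delimited body of a bulk request."""
    lines = []
    for tweet in tweets:
        # Action line: index the document under the tweet's own id, so a tweet
        # that is seen twice overwrites itself instead of being duplicated.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": tweet["id_str"]}}))
        # Source line: only the fields we want to search on (an arbitrary pick).
        lines.append(json.dumps({
            "text": tweet["text"],
            "user": tweet["user"]["screen_name"],
            "created_at": tweet["created_at"],
        }))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


sample = [{
    "id_str": "1",
    "text": "hello",
    "user": {"screen_name": "someone"},
    "created_at": "Mon Feb 17 10:00:00 +0000 2014",
}]
body = to_bulk_lines(sample)
```

The resulting string would be sent to the `_bulk` endpoint; batching many tweets per request is what keeps indexing cheap when the stream gets busy.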


January 29, 2014 (updated November 16, 2015)

5 Minutes with Viral News on Twitter: Golden Globes, Grammy Awards and State Of The Union 2014

Stats from our MongoDB replica set servers during viral tweets (Golden Globes, Grammys, and State of the Union 2014)

Here are some of the popular news accounts on Twitter whose tweets can be retweeted up to 1,500 times within 5 minutes:

January 13, 2014 (updated November 16, 2015)

How “Golden Globe Awards” is about to break my real-time Twitter app servers on AWS!

Tweets are being retweeted more than ever when they mention who just won an award! Look at the stats from my MongoDB replica set and you’ll see that the updates have a huge peak. Also, the global lock went from 20% to over 60%.


November 24, 2013 (updated November 16, 2015)

Source of Tweets in France and United States

So here is the thing: how are people sharing on Twitter around the world? What devices or services do they usually use to share their check-ins, photos, videos, or updates on Twitter?

This is a really simple analysis I did on data that I’ve been gathering for almost 2 months now (around 97 million tweets from the US and 5.5 million tweets from France at the time of this study), using Hadoop batch processing to get some answers to the question above.
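
The counting itself follows the classic map and reduce pattern. The sketch below mimics the two steps in plain Python on a handful of made-up tweets; it is not the Hadoop job used for the study. It relies on the `source` field of a tweet being an HTML anchor whose label is the client name, and the helper names `map_source` and `reduce_counts` are hypothetical.

```python
import re
from collections import Counter

# The tweet `source` field is an HTML anchor; its visible label is the client name.
ANCHOR_LABEL = re.compile(r">([^<]*)<")


def map_source(tweet):
    """Map step: emit (client_name, 1) for one tweet."""
    raw = tweet.get("source", "")
    match = ANCHOR_LABEL.search(raw)
    # Fall back to the raw value when the field is not an anchor (e.g. "web").
    name = match.group(1) if match else raw
    return (name or "unknown", 1)


def reduce_counts(pairs):
    """Reduce step: sum the counts per client name."""
    totals = Counter()
    for name, count in pairs:
        totals[name] += count
    return totals


# Made-up sample records, just to exercise the two steps.
tweets = [
    {"source": '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>'},
    {"source": '<a href="http://instagram.com" rel="nofollow">Instagram</a>'},
    {"source": '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>'},
    {"source": "web"},
]
totals = reduce_counts(map_source(t) for t in tweets)
```

On a cluster the map step runs in parallel over chunks of the data and the reduce step merges the partial counts, which is what makes this kind of tally practical over tens of millions of tweets.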

I have 4 EC2 instances up and running 24×7 to track tweets (from the Twitter Public Streaming API) and store them in a MongoDB replica set. One of the nodes is an application server that I built on the Node.js stack to process and visualise the stream in real time as it comes into the system. Currently I get an average of 100 tweets/s, a minimum of 30-40/s, and a maximum of 180-220/s. More than one Twitter account is tracking tweets at the same time, by location and by different keywords. That’s why I sometimes get more than 1% of the entire stream!
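
Figures like the average, minimum, and maximum tweets per second can be derived by bucketing arrival times into one-second slots. This is a small sketch of that calculation under simple assumptions, not the monitoring code running on the servers: the function name `per_second_stats` is made up, the input is assumed to be a non-empty list of integer Unix timestamps, and the sample values are invented.

```python
from collections import Counter
from statistics import mean


def per_second_stats(arrival_seconds):
    """Return (average, minimum, maximum) tweets per second.

    `arrival_seconds` holds one integer Unix timestamp per received tweet.
    Seconds with no tweets inside the observed range count as zero.
    """
    counts = Counter(arrival_seconds)
    first, last = min(counts), max(counts)
    per_second = [counts.get(second, 0) for second in range(first, last + 1)]
    return mean(per_second), min(per_second), max(per_second)


# Invented arrivals: 3 tweets in second 100, 1 in second 101, 2 in second 102.
arrivals = [100, 100, 100, 101, 102, 102]
average, lowest, highest = per_second_stats(arrivals)
```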


October 17, 2013 (updated November 16, 2015)

Visualising Tweets on MapBox

These are the technologies used:

#AWS #EC2 #MongoDB #Redis #Nodejs #Socketio #Twitter #MapBox #TileMill