Building a real-time news analytics

It is always interesting to find out what is happening around the world as it happens. Knowing about something as it happens is called “real time”, or more accurately “near real time”, since network latency and the time needed to process, visualize, and stream the data all add some delay.

Now imagine you could monitor most international news agencies on Twitter in real time. For the last few months I have been developing an application that not only shows you important news and highlights from around the world as they happen, but also persists the data for a 24-hour window, so you can always go back and read important news and highlights that have already happened.
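
To make the 24-hour window idea concrete, here is a minimal in-memory sketch in Python. It is not the application's actual storage layer (the post does not describe one); the `RollingWindow` class and its method names are made up for illustration, and it assumes items arrive roughly in time order, as they would from a live stream.

```python
from collections import deque
from datetime import datetime, timedelta, timezone


class RollingWindow:
    """Keep only the items seen during the most recent `window` of time.

    Hypothetical helper for illustration; not part of any library.
    """

    def __init__(self, window=timedelta(hours=24)):
        self.window = window
        self._items = deque()  # (timestamp, item) pairs, oldest first

    def add(self, item, ts):
        # Assumption: timestamps are non-decreasing, so the oldest
        # entry is always at the left end of the deque.
        self._items.append((ts, item))
        self._evict(ts)

    def _evict(self, now):
        # Drop everything older than the window, starting from the oldest.
        cutoff = now - self.window
        while self._items and self._items[0][0] < cutoff:
            self._items.popleft()

    def recent(self, now):
        # Return the items still inside the window as of `now`.
        self._evict(now)
        return [item for _, item in self._items]


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
window = RollingWindow()
window.add("headline A", start)
window.add("headline B", start + timedelta(hours=20))
still_visible = window.recent(start + timedelta(hours=25))
```

Twenty-five hours after the start, only "headline B" is still inside the window. A real deployment would more likely let the database expire old records, but the eviction logic is the same idea.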


Building Search Engine by ElasticSearch

I’ve been working with ElasticSearch for a while now. I gotta say it’s a powerful open-source tool for building distributed real-time search and analytics engines.

It also uses shards and replicas spread across multiple machines to make your architecture reliable and scalable. It’s not just a simple full-text search engine, even though it does that very well.
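
As a rough illustration of how shards and replicas are configured, the sketch below builds the settings body you would send when creating an index. `number_of_shards` and `number_of_replicas` are real index settings; the helper names and the values 3 and 1 are only assumptions for the example, not the setup used here.

```python
import json


def index_settings(primary_shards, replicas_per_shard):
    """Build the request body for creating an index with the given layout."""
    return {
        "settings": {
            "number_of_shards": primary_shards,
            "number_of_replicas": replicas_per_shard,
        }
    }


def total_shard_copies(primary_shards, replicas_per_shard):
    # Every primary shard has `replicas_per_shard` extra copies,
    # so the cluster holds primaries * (1 + replicas) shards in total.
    return primary_shards * (1 + replicas_per_shard)


# Illustrative values: 3 primaries, 1 replica of each.
body = index_settings(3, 1)
copies = total_shard_copies(3, 1)
body_json = json.dumps(body)
```

With these example values the index is split into 3 primary shards plus 3 replica shards, 6 in total, so losing a single machine should not lose data as long as a primary and its replica sit on different nodes.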

My ElasticSearch setup is connected to my MongoDB replica set and indexes tweets as they are streamed from the Twitter API. I track tweets for a few projects (health, news, UN GlobalPulse, etc.) and try to index some of those projects in ES on the fly.
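
To give a feel for what indexing tweets on the fly involves, here is a small Python sketch that turns a batch of tweets into the newline-delimited body that the ElasticSearch bulk API accepts (one action line followed by one document line per tweet). This is not the MongoDB-connected setup described above, just a standalone illustration. The function name, the `tweets` index name, and the sample tweet are all assumptions; the `id_str` and `text` fields follow the Twitter v1.1 tweet format.

```python
import json


def to_bulk_payload(tweets, index="tweets"):
    """Serialize tweets into a bulk-API request body (NDJSON).

    Hypothetical helper: each tweet is assumed to be a dict carrying
    an "id_str" field, which is reused as the document id.
    """
    lines = []
    for tweet in tweets:
        # Action line: tells ElasticSearch where to index the next line.
        action = {"index": {"_index": index, "_id": tweet["id_str"]}}
        lines.append(json.dumps(action))
        # Source line: the document itself.
        lines.append(json.dumps(tweet))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n" if lines else ""


sample = [{"id_str": "1", "text": "Breaking: example headline"}]
payload = to_bulk_payload(sample)
```

The resulting string would be sent to the `_bulk` endpoint. Depending on the ElasticSearch version, the action line may also need a `_type`. Reusing the tweet id as the document id means a tweet that is delivered twice overwrites itself instead of being indexed twice.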
