Pinterest is experimenting with MemSQL for real-time data analytics

Gigaom

Pinterest shed more light on how the social scrapbook and visual discovery service analyzes data in real time, it said in a blog post on Wednesday, also revealing details about how it’s exploring a combination of MemSQL and Spark Streaming to improve the process.

Currently, Pinterest uses a custom-built log-collecting agent dubbed Singer that the company attaches to all of its application servers. Singer then collects all those application log files and with the help of the real-time messaging framework Apache Kafka it can transfer that data to Storm or Spark and other “custom built log readers” that “process these events in real-time.”

Pinterest also uses its own log-persistence service called Secor to read that log data moving through Kafka and then write it to Amazon S3, after which Pinterest’s “self-serve big data platform loads the data from S3 into many different Hadoop clusters for batch processing,” the blog post…

View original post 131 more words

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.