Pinterest is experimenting with MemSQL for real-time data analytics

Gigaom

In a blog post on Wednesday, Pinterest shed more light on how the social scrapbook and visual discovery service analyzes data in real time, and revealed details about how it is exploring a combination of MemSQL and Spark Streaming to improve the process.

Currently, Pinterest uses a custom-built log-collecting agent dubbed Singer that the company attaches to all of its application servers. Singer collects the application log files and, with the help of the real-time messaging framework Apache Kafka, transfers that data to Storm, Spark and other “custom built log readers” that “process these events in real-time.”

Pinterest also uses its own log-persistence service called Secor to read that log data moving through Kafka and then write it to Amazon S3, after which Pinterest’s “self-serve big data platform loads the data from S3 into many different Hadoop clusters for batch processing,” the blog post…
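The flow described above (a collecting agent publishes log lines to a message bus, and independent readers consume them for real-time processing and for persistence) can be sketched in a few lines of Python. This is only a toy illustration, not Pinterest's code: `InMemoryTopic` is a hypothetical in-memory stand-in for a Kafka topic, the `store` list stands in for an S3 bucket, and the event names are made up.

```python
from collections import defaultdict


class InMemoryTopic:
    """Hypothetical stand-in for a Kafka topic: an append-only log
    with an independent read offset per consumer."""

    def __init__(self):
        self.messages = []
        self.offsets = defaultdict(int)

    def publish(self, message):
        self.messages.append(message)

    def poll(self, consumer):
        """Return the messages this consumer has not seen yet and advance its offset."""
        start = self.offsets[consumer]
        self.offsets[consumer] = len(self.messages)
        return self.messages[start:]


def collect_logs(log_lines, topic):
    """Agent role (Singer in the article): forward each log line to the topic."""
    for line in log_lines:
        topic.publish(line)


def count_events(topic, counts):
    """Real-time reader role: keep a running count per event type."""
    for line in topic.poll("realtime"):
        event_type = line.split()[0]
        counts[event_type] += 1


def persist(topic, store):
    """Persistence role (Secor in the article): copy unseen messages to a store."""
    store.extend(topic.poll("persistence"))


topic = InMemoryTopic()
counts = defaultdict(int)
store = []  # assumption: a list standing in for an S3 bucket

# Made-up log lines of the form "<event_type> <details>".
collect_logs(["repin user=1", "click user=2", "repin user=3"], topic)
count_events(topic, counts)
persist(topic, store)
```

Because each reader tracks its own offset, the real-time counter and the persistence step both see every message without interfering with each other, which is the property that lets one log stream feed both the streaming and the batch side.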
