Saturday, November 21 • 3:00pm - 3:45pm
Performant data processing with PySpark, SparkR and DataFrame API

One of the great features of Apache Spark is that it lets you write code in Python and R, two languages popular in the data processing field. However, for certain reasons, you can face poor performance when using Spark from those languages.

Since Apache Spark 1.4, using the DataFrame API gives you almost the same performance as using JVM languages such as Scala or Java.

In this talk, I will cover the background, examples, and pitfalls of the DataFrame API, and show how to get the best performance out of Apache Spark from non-JVM languages.

Ryuji Tamagawa

Evangelist, Sky
Sometimes a software developer, sometimes a troubleshooting field engineer, working from A to Z on software products. Also works as a translator for O'Reilly Japan, having translated over 20 titles, mostly on big data, cloud computing, and software development.

Saturday November 21, 2015 3:00pm - 3:45pm JST
