python

PySpark seemingly allows Python code to run on Apache Spark, a JVM-based computing framework. How is this possible? I recently needed to answer this question, and although the PySpark API itself is well documented, there is little in-depth information on its implementation. This article contains my findings from diving into the Spark source code to find out what’s really going on.

Spark vs PySpark

For the purposes of this article, Spark refers to the Spark JVM implementation as a whole.

Read more…

Even Santa has to deal with software licensing and non-standard serialisation formats.

Read more…

Apparently elves also aren’t great at security.

Read more…