Mar 25, 2023 · 5 min read
Testing Spark Streaming locally with EmbeddedKafka
It's been a while since my previous article in the Spark/Scala series, where we ran Spark locally using Docker. And even before that we...
Dec 15, 2022 · 4 min read
Spark: understanding Physical Plans
You have some kind of query - maybe it's written using the Dataset API, maybe using Spark SQL. It reads from one or several Hive tables, or...
Oct 27, 2022 · 3 min read
Starting up Spark Standalone Cluster with Docker
In the previous post we created a simple Spark app and used Scalatest to check that it actually works. Even though we were creating a...
Oct 22, 2022 · 3 min read
Testing Spark apps locally with Scalatest
Years ago I wrote a blog post describing the process of building and deploying a simple Spark app. This post is now too old to be of any use...