Chashnikov.dev

Feb 18, 2024 · 5 min read
Navigating Data Management: Warehouses, Lakes and Lakehouses
In today’s dynamic data management landscape, the terminology and concepts related to data storage and processing have become more...
29 views
0 comments


Jan 26, 2024 · 5 min read
One Billion Row Challenge - view from sidelines
In the last couple of days I’ve been hearing, reading and poking around the 1 Billion Row Challenge (1BRC) - a "contest" for Java / JVM...
18,302 views
0 comments

Jun 12, 2023 · 5 min read
Inverted Indexes: A Step-by-Step Implementation Guide
Inverted Indexes: why you need one, and how to implement one in Scala quickly and easily
11,319 views
1 comment

May 29, 2023 · 6 min read
The Do's and Don'ts of Apache Spark - Best Practices for Efficient Data Processing
Apache Spark has emerged as one of the most popular big data processing frameworks due to its speed, scalability, and ease of use....
14,023 views
0 comments

Apr 22, 2023 · 5 min read
Testing Spark StructuredStreaming locally with EmbeddedKafka - part 2, now with objects
This is a continuation of "Testing Spark Streaming locally with EmbeddedKafka". If you're not familiar with EmbeddedKafka - I'd recommend...
3,162 views
0 comments

Mar 25, 2023 · 5 min read
Testing Spark Streaming locally with EmbeddedKafka
It's been a while since my previous article in the Spark/Scala series, where we ran Spark locally using Docker. And even before that we...
899 views
1 comment


Jan 22, 2023 · 2 min read
Evaluating management
With performance review season in full swing, I got thinking about the role of managers in it, expectations from Individual Contributor's...
285 views
0 comments

Dec 15, 2022 · 4 min read
Spark: understanding Physical Plans
You have some kind of query - maybe it's written using the Dataset API, maybe using Spark SQL. It reads from one or several Hive tables, or...
10,451 views
0 comments


Nov 14, 2022 · 14 min read
Database Internals: A very short conspect
About a year ago I listened to a Software Engineering Radio podcast episode with Alex Petrov, where Alex was discussing his new book,...
5,251 views
0 comments


Oct 27, 2022 · 3 min read
Starting up Spark Standalone Cluster with Docker
In the previous post we created a simple Spark app, and used Scalatest to check that it actually works. Even though we were creating a...
4,866 views
0 comments


Oct 22, 2022 · 3 min read
Testing Spark apps locally with Scalatest
Years ago I wrote a blog post describing the process of building and deploying a simple Spark app. This post is now too old to be of any use...
10,517 views
2 comments