Chashnikov.dev

Navigating Data Management: Warehouses, Lakes and Lakehouses
In today’s dynamic data management landscape, the terminology and concepts related to data storage and processing have become more...
Feb 18, 2024 · 5 min read
29 views
0 comments


One Billion Row Challenge - view from sidelines
In the last couple of days I’ve been hearing, reading and poking around the 1 Billion Row Challenge (1BRC) - a “contest” for Java / JVM...
Jan 27, 2024 · 5 min read
18,313 views
0 comments

Inverted Indexes: A Step-by-Step Implementation Guide
Inverted Indexes: why you need one, and how to implement one in Scala quickly and easily
Jun 12, 2023 · 5 min read
11,349 views
1 comment

The Do's and Don'ts of Apache Spark - Best Practices for Efficient Data Processing
Apache Spark has emerged as one of the most popular big data processing frameworks due to its speed, scalability, and ease of use....
May 29, 2023 · 6 min read
14,073 views
0 comments

Testing Spark StructuredStreaming locally with EmbeddedKafka - part 2, now with objects
This is a continuation of "Testing Spark Streaming locally with EmbeddedKafka". If you're not familiar with EmbeddedKafka - I'd recommend...
Apr 22, 2023 · 5 min read
3,165 views
0 comments

Testing Spark Streaming locally with EmbeddedKafka
It's been a while since my previous article in the Spark/Scala series, where we ran Spark locally using Docker. And even before that we...
Mar 25, 2023 · 5 min read
916 views
1 comment


Evaluating management
With performance review season in full swing, I got thinking about the role of managers in it, expectations from Individual Contributor's...
Jan 22, 2023 · 2 min read
285 views
0 comments

Spark: understanding Physical Plans
You have some kind of query - maybe it's written using Dataset API, maybe using Spark SQL. It reads from one or several Hive tables, or...
Dec 15, 2022 · 4 min read
10,506 views
0 comments


Database Internals: A very short conspect
About a year ago I listened to Software Engineering Radio podcast episode with Alex Petrov, where Alex was discussing his new book,...
Nov 14, 2022 · 14 min read
5,272 views
0 comments


Starting up Spark Standalone Cluster with Docker
In the previous post we created a simple Spark app, and used Scalatest to check that it actually works. Even though we were creating a...
Oct 27, 2022 · 3 min read
4,916 views
0 comments


Testing Spark apps locally with Scalatest
Years ago I wrote a blog post describing the process of building and deploying a simple Spark app. This post is now too old to be of any use...
Oct 22, 2022 · 3 min read
10,635 views
2 comments