Chashnikov.dev
Feb 18, 2024 · 5 min read
Navigating Data Management: Warehouses, Lakes and Lakehouses
In today’s dynamic data management landscape, the terminology and concepts related to data storage and processing have become more...
29 views · 0 comments
Jan 26, 2024 · 5 min read
One Billion Row Challenge - view from sidelines
In the last couple of days I’ve been hearing, reading and poking around the 1 Billion Row Challenge (1BRC) - a “contest” for Java / JVM...
18,277 views · 0 comments
Jun 12, 2023 · 5 min read
Inverted Indexes: A Step-by-Step Implementation Guide
Inverted Indexes: why you need one, and how to implement one in Scala quickly and easily
11,271 views · 1 comment
May 29, 2023 · 6 min read
The Do's and Don'ts of Apache Spark - Best Practices for Efficient Data Processing
Apache Spark has emerged as one of the most popular big data processing frameworks due to its speed, scalability, and ease of use....
13,969 views · 0 comments
Apr 22, 2023 · 5 min read
Testing Spark StructuredStreaming locally with EmbeddedKafka - part 2, now with objects
This is a continuation of "Testing Spark Streaming locally with EmbeddedKafka". If you're not familiar with EmbeddedKafka, I'd recommend...
3,157 views · 0 comments
Mar 25, 2023 · 5 min read
Testing Spark Streaming locally with EmbeddedKafka
It's been a while since my previous article in the Spark/Scala series, where we ran Spark locally using Docker. And even before that we...
867 views · 1 comment
Jan 22, 2023 · 2 min read
Evaluating management
With performance review season in full swing, I got thinking about the role of managers in it, expectations from Individual Contributor's...
285 views · 0 comments
Dec 15, 2022 · 4 min read
Spark: understanding Physical Plans
You have some kind of query - maybe it's written using the Dataset API, maybe using Spark SQL. It reads from one or several Hive tables, or...
10,351 views · 0 comments
Nov 14, 2022 · 14 min read
Database Internals: A very short conspect
About a year ago I listened to a Software Engineering Radio podcast episode with Alex Petrov, where Alex was discussing his new book,...
5,196 views · 0 comments
Oct 27, 2022 · 3 min read
Starting up Spark Standalone Cluster with Docker
In the previous post we created a simple Spark app, and used Scalatest to check that it actually works. Even though we were creating a...
4,823 views · 0 comments
Oct 22, 2022 · 3 min read
Testing Spark apps locally with Scalatest
Years ago I wrote a blog post describing the process of building and deploying a simple Spark app. This post is now too old to be of any use...
10,377 views · 2 comments