Scenarios
Hands-on labs to practice your Linux, Docker & Kubernetes troubleshooting skills.
New here? Start with these free scenarios. No signup required.
New scenarios
A new GPU-enabled Kubernetes cluster has just gone live.
The platform team shared one simple instruction: Add gpu=enabled to schedule workloads on nodes with GPUs.
A teammate deployed an ML inference application called myapp in the apps namespace.
The deployment has an affinity defined to only schedule pods on nodes with the label gpu=enabled, but all pods appear as Pending.
You can find the deployment definition in /home/lab/manifest.yaml.
A critical internal application has been deployed to a PCI-regulated Kubernetes cluster.
Unlike the other Kubernetes clusters, this one enforces a stricter set of security and compliance controls.
The deployment process completed, but the application is not running.
The application is called myapp and it is deployed in the apps namespace.
You can find the deployment definition in /home/lab/deploy.yaml.
All scenarios
This is a playground with no tasks. Use it to try things out and get familiar with the platform before taking on the real challenges.
The configuration for the process myapp needs to be reloaded, but the process cannot be stopped or restarted.
Luckily, myapp has a man page with detailed information about how to do it.
An issue has been reported in production with a process called myapp.
You've been given access to the Linux server running it and you've been asked to investigate it.
The process is running, but its logging configuration is not documented.
While reviewing logs, you came across a log file you don't recognize: /var/log/unknown.log.
There's no documentation mentioning which service or process is responsible for it.
The service myapp fails to start because another process is already listening on 127.0.0.1:8080, the address myapp is configured to listen on.
Since myapp is a binary and its source code cannot be modified, you have been asked to resolve the conflict so that myapp can start successfully.
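The conflict described above can be reproduced in a few lines. The sketch below is only an illustration, not part of the lab: it uses an ephemeral port instead of 8080, and the function name second_bind_fails is hypothetical. It shows that a second socket cannot bind to an address that another socket is already listening on.

```python
import errno
import socket


def second_bind_fails() -> bool:
    """Return True if binding a second socket to a busy address fails with EADDRINUSE."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 asks the kernel for any free port; the lab itself uses 8080.
        first.bind(("127.0.0.1", 0))
        first.listen()
        host, port = first.getsockname()
        try:
            # Same address and port as the listening socket.
            second.bind((host, port))
        except OSError as exc:
            return exc.errno == errno.EADDRINUSE
        return False
    finally:
        second.close()
        first.close()


if __name__ == "__main__":
    print("second bind rejected:", second_bind_fails())
```

This is the same class of error a service reports when its configured address is already taken by another process.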
The application myapp generates an excessive amount of log output by default.
You have been tasked with reducing the log verbosity without modifying the program.
A newer version of the application myapp, version 2.0.0, has been installed in /home/lab/.local/bin.
However, when you execute myapp from the terminal without specifying the absolute path, the system runs version 1.0.0 instead.
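This behavior is consistent with how a bare command name is resolved: the directories listed in PATH are searched in order and the first match wins. The sketch below is a rough illustration using Python's shutil.which, with temporary directories standing in for the real install locations. The helper names resolve and demo are hypothetical.

```python
import os
import shutil
import tempfile


def resolve(name, directories):
    """Resolve a command name against an ordered list of directories."""
    return shutil.which(name, path=os.pathsep.join(directories))


def demo():
    """Show that the order of the search path decides which copy is found."""
    with tempfile.TemporaryDirectory() as old_dir, tempfile.TemporaryDirectory() as new_dir:
        # Create an executable placeholder called myapp in both directories.
        for directory in (old_dir, new_dir):
            path = os.path.join(directory, "myapp")
            with open(path, "w") as handle:
                handle.write("#!/bin/sh\n")
            os.chmod(path, 0o755)

        old_first = resolve("myapp", [old_dir, new_dir])
        new_first = resolve("myapp", [new_dir, old_dir])
        return (
            old_first == os.path.join(old_dir, "myapp"),
            new_first == os.path.join(new_dir, "myapp"),
        )


if __name__ == "__main__":
    print(demo())
```

Whichever directory appears first in the search path supplies the copy that runs.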
An application named myapp has been installed in the directory /opt/bin.
However, when you try to execute it by typing myapp in the terminal, the command is not recognized and the program doesn't run.
A program called myapp is installed on the system.
It writes all error messages to STDERR and all other output to STDOUT.
Since it's a binary, you cannot modify its source code to change its logging behavior.
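Because the two kinds of output travel on separate streams, they can be handled independently without touching the program. The sketch below is one way to see that from Python. The child process is a hypothetical stand-in for myapp that prints one line to each stream, and the function name run_and_split is an assumption.

```python
import subprocess
import sys

# Hypothetical stand-in for myapp: one line to STDOUT, one line to STDERR.
CHILD_CODE = "import sys; print('info'); print('error', file=sys.stderr)"


def run_and_split():
    """Run the child and return its STDOUT and STDERR as separate strings."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        capture_output=True,  # collect both streams, each in its own buffer
        text=True,
        check=True,
    )
    return result.stdout, result.stderr


if __name__ == "__main__":
    out, err = run_and_split()
    print("STDOUT:", out.strip())
    print("STDERR:", err.strip())
```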
A program called myapp is used to generate operational reports about scheduled tasks.
By design, the application sends INFO messages to STDOUT and ERROR messages to STDERR.
Historically, all generated reports have been persisted in the log file /home/lab/long_report.log.
You're troubleshooting a Kubernetes deployment where an application is failing to authenticate using a configured Secret.
Other Secrets in the same environment work correctly, indicating the issue is isolated to this specific secret rather than the application code.
The original password can be found in /home/lab/password.txt.
The base64-encoded secret can be found in /home/lab/secret.txt.
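Comparing a plain value with its base64 form is easiest when both are decoded or encoded side by side. The sketch below illustrates one common way such a pair can drift apart, a trailing newline that was encoded along with the value. This is an assumption for illustration only, not a statement about what is wrong in the lab, and the password s3cr3t is hypothetical.

```python
import base64


def encode(text: str) -> str:
    """Base64-encode a string exactly as given, byte for byte."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode(encoded: str) -> str:
    """Decode a base64 string back to text."""
    return base64.b64decode(encoded).decode("utf-8")


password = "s3cr3t"  # hypothetical value, not taken from the lab

clean = encode(password)
with_newline = encode(password + "\n")  # e.g. the value was echoed with a newline

if __name__ == "__main__":
    print(clean)
    print(with_newline)
    # repr() makes invisible characters such as "\n" visible.
    print(repr(decode(with_newline)))
```

Printing the decoded value with repr() is a quick way to spot whitespace that is invisible in a normal terminal.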
This scenario is just for fun.
It was inspired by the "How to exit vi" memes.
It's presented as a tutorial instead of as a real challenge to show how to use signals to exit a program.
Imagine you don't know how to exit vi properly.
You're working on a system where an application called myapp has been deployed.
The binary and the configuration file were copied to the directory /opt/myapp, but the application fails to start.
The logs indicate that the application is unable to find its configuration file, but there is no documentation on where the configuration file should be located.
You're troubleshooting a service startup issue on a Linux host.
The program myapp fails to start, returning a generic error indicating that the address is already in use.
There's no documentation for the service and the program doesn't log any useful information about which port it expects to listen on.
You've been asked to address excessive logging in containers created from the image scenario15:1.0.0.
By default, these containers generate a high volume of log output, creating noise and wasting resources.
You're managing a running Docker container named myapp that hosts a long-lived application process.
A recent configuration change needs to be applied without any downtime, so restarting or stopping the container is not an option.
The program running inside the container handles the following signals:
SIGHUP: Reloads its configuration.
SIGUSR1: Shows its current configuration.
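To make the signal behavior above concrete, here is a minimal sketch of a process that handles SIGHUP and SIGUSR1 in the way described. It is not the lab's program: the configuration source (an environment variable named MYAPP_LOG_LEVEL), the events list, and all function names are assumptions. The process signals itself so the example is self-contained.

```python
import os
import signal
import time

config = {"log_level": "info"}  # hypothetical in-memory configuration
events = []                     # records what the handlers did


def load_config():
    """Hypothetical config source: read the log level from the environment."""
    return {"log_level": os.environ.get("MYAPP_LOG_LEVEL", "info")}


def on_sighup(signum, frame):
    """SIGHUP: reload the configuration without restarting the process."""
    config.update(load_config())
    events.append("reloaded")


def on_sigusr1(signum, frame):
    """SIGUSR1: report the current configuration."""
    events.append("log_level=" + config["log_level"])


def send_and_wait(sig, expected_events):
    """Send a signal to this process and wait briefly for its handler to run."""
    os.kill(os.getpid(), sig)
    deadline = time.monotonic() + 1.0
    while len(events) < expected_events and time.monotonic() < deadline:
        time.sleep(0.01)


def demo():
    signal.signal(signal.SIGHUP, on_sighup)
    signal.signal(signal.SIGUSR1, on_sigusr1)
    os.environ["MYAPP_LOG_LEVEL"] = "debug"  # simulate a configuration change
    send_and_wait(signal.SIGHUP, 1)
    send_and_wait(signal.SIGUSR1, 2)
    return list(events)


if __name__ == "__main__":
    print(demo())
```

The process ID stays the same throughout, which is the point of reloading through a signal instead of a restart.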
You've just joined a team preparing a new Go service for production.
A concern has come up during a pre-release review: the current Docker image, tagged myapp:1.0, has a DISK USAGE over 1 GB.
This is above the acceptable threshold for an image used in production.
As part of the optimization effort, you've been tasked with reducing the size of the Docker image.
The source code of the service can be found in the directory /home/lab/myapp.
Base images pre-approved for production are already present in the environment.
You're working with a containerized Go application built as the Docker image myapp:1.0.
When you attempt to run it, the container fails to start, even though a teammate confirms it runs successfully on their machine.
The image was built from the project located at /home/lab/myapp.
You've been asked to troubleshoot a local development environment for a service called myapp.
This service has been provisioned using Docker Compose in the directory /home/lab/myapp.
Running docker compose up should start the service successfully, but it's not working as expected.
You're investigating a failure in a containerized application stack.
The myapp container hosts a web application that depends on a PostgreSQL database running in a separate container named db.
At the moment, the myapp container is unable to start because it cannot establish a connection to its database.
After a sudden reorganization, your team has just inherited a legacy Docker image with no handover or documentation.
The container starts when you run the image, but no ports are exposed, and the logs are unhelpful.
Right now, the application running inside the container is completely inaccessible.
Find out which port the application is listening on and expose it.
A developer on your team proposes adding an API key during the Docker build process.
Their plan: copy a file containing the key into the image, use it during the build, and then delete it in a later step.
Since that file won't be present in the final Docker image, they consider it safe.
You're not convinced.
All layers of a Docker image are kept, and anyone with access to the image will be able to extract the API key.
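The reasoning can be sketched with a simplified model of image layers. In the snippet below each layer is an in-memory tar archive, and a deletion in a later layer is recorded as a whiteout entry (a file named with the .wh. prefix) instead of erasing anything from the earlier layer. This is a toy model under stated assumptions, not how any particular tool is implemented: the file name api_key.txt, the key value, and all helper names are hypothetical, and only top-level files are handled.

```python
import io
import tarfile

WHITEOUT_PREFIX = ".wh."


def make_layer(files):
    """Build an in-memory tar archive from a {name: bytes} mapping."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def read_layer(layer):
    """Return the regular files stored in a single layer."""
    with tarfile.open(fileobj=io.BytesIO(layer), mode="r") as tar:
        return {m.name: tar.extractfile(m).read() for m in tar.getmembers() if m.isfile()}


def final_view(layers):
    """Apply the layers in order, honoring whiteouts, to get the merged filesystem."""
    view = {}
    for layer in layers:
        for name, data in read_layer(layer).items():
            if name.startswith(WHITEOUT_PREFIX):
                view.pop(name[len(WHITEOUT_PREFIX):], None)  # hide the deleted file
            else:
                view[name] = data
    return view


layers = [
    make_layer({"api_key.txt": b"hypothetical-key"}),  # step that copies the key
    make_layer({".wh.api_key.txt": b""}),              # later step that deletes it
]

if __name__ == "__main__":
    print("in final view:", "api_key.txt" in final_view(layers))
    print("in first layer:", read_layer(layers[0])["api_key.txt"])
```

The merged view no longer shows the file, yet the first layer still carries its full contents for anyone who reads that layer directly.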
A new marketing campaign is expected to drive a sharp increase in traffic to your Kubernetes cluster.
A legacy third-party pod named myapp in the apps namespace may not have sufficient CPU resources to handle the load.
This app is critical to the business and cannot be restarted.
You tried to modify it with kubectl edit, but you got the classic error:
Forbidden: pod updates may not change fields other than ...